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Abstract 


When  computing  eigenvalues  of  symmetric  matrices  and  singular  values  of  general  matrices 
in  finite  precision  arithmetic  we  in  general  only  expect  to  compute  them  with  an  error  bound 
proportional  to  the  product  of  machine  precision  and  the  norm  of  the  matrix.  In  particular, 
we  do  not  expect  to  compute  tiny  eigenvalues  and  singular  values  to  high  relative  accuracy. 
There  are  some  important  classes  of  matrices  where  we  can  do  much  better,  including  bidiag- 
onal  matrices,  scaled  diagonally  dominant  matrices,  and  scaled  diagonally  dominant  definite 
pencils.  These  classes  include  many  graded  matrices,  and  all  symmetric  positive  definite 
matrices  which  can  be  consistently  ordered  (and  thus  all  symmetric  positive  definite  tridiago- 
nal  matrices).  In  particular,  the  singular  values  and  eigenvalues  are  determined  to  high  rela- 
tive precision  independent  of  their  magnitudes,  and  there  are  algorithms  to  compute  them 
this  accurately.  The  eigenvectors  are  also  determined  more  accurately  than  for  general 
matrices,  and  may  be  computed  more  accurately  as  well.  This  work  extends  results  of  Kahan 
and  Demmel  for  bidiagonal  and  tridiagonal  matrices. 

Keywords:  Graded  matrices.  Singular  value  decomposition,  symmetric  eigenproblem,  pertur- 
bation theory,  error  analysis 
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1.  Introdaction 

When  computing  the  eigenvalues  of  symmetric  matrices  and  singular  values  of  general 
matrices  in  finite  precision  arithmetic  one  generally  only  expects  to  compute  them  with  an 
error  bound  /(n)c||A  ||,  where  /(n)  is  a  modestly  growing  function  of  the  matrix  dimension 
n,  e  is  the  machine  precision,  and  \\A  \\  is  the  2-norm  of  the  matrix  A.  This  follows  as  a 
result  of  standard  theorems  which  state: 

(1.1)  A  perturbation  8A  in  the  matrix  A  cannot  change  its  eigenvalues  (singular  values)  by 
more  than  ||5A  |1  [8]. 

(1.2)  The  standard  algorithm  for  computing  eigenvalues  (singular  values)  of  A  computes  the 
exact  eigenvalues  (singular  values)  of  A+5A,  ||8A  ||  :S /(n)€|lA  ||,  where  /(n)  is  a 
modestly  growing  function  of  n  and  c  is  the  machine  precision  [8]. 

These  error  bounds  imply  that  tiny  eigenvalues  and  singular  values  (tiny  compared  to 
||A  II)  cannot  generally  be  computed  to  high  relative  accuracy,  since  the  error  bound 
/(n)e  ||A  II  may  be  much  larger  than  the  desired  quantity.  In  fact,  if  each  matrix  entry  is  unc- 
ertain in  its  least  significant  digits,  the  tiny  eigenvalues  and  singular  values  may  not  even  be 
determined  accurately  by  the  data. 

Sometimes,  however,  the  eigenvalues  and  singular  values  are  determined  much  more 
accurately  than  error  bounds  like  /(n)e||A||  would  indicate.  This  was  shown  for  singular 
values  of  bidiagonal  matrices  in  [6],  where  it  was  proven  that  small  relative  perturbations  in 
the  bidiagonal  entries  only  cause  small  relative  perturbations  in  the  singular  values,  indepen- 
dent of  their  magnitudes.  It  was  also  shown  how  to  compute  all  the  singular  values  to  high 
relative  accuracy.  In  this  paper  we  extend  these  results  to  eigenvalues  of  symmetric  scaled 
diagonally  dominant  matrices  and  scaled  diagonally  dominant  definite  pencils.  (Henceforth 
we  will  abbreviate  "scaled  diagonally  dominant"  by  s.d.d.)  A  symmetric  s.d.d.  matrix  is  any 
matrix  of  the  form  AAA,  where  A  is  symmetric  and  diagonally  dominant  in  the  usual  sense, 
and  A  is  an  arbitrary  nonsingular  diagonal  matrix.  A  pencil  H —  \M  is  s.d.d.  definite  if  H 
and  M  are  symmetric  s.d.d.  and  M  is  positive  definite.  Examples  of  s.d.d.  matrices  arc  the 
"graded"  matrices 
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Note  that  Aq  graded  in  the  usual  sense,  but  not  diagonally  dominant  in  the  usual  sense.  A]  is 
neither  diagonally  dominant  in  the  usual  sense,  nor  graded  in  the  usual  sense,  since  the  diag- 
onal entries  are  positive  and  negative,  and  not  sorted.  Thus  we  see  that  being  s.d.d.  is  a 
strictly  larger  category  than  either  the  usual  diagonal  dominance  or  grading.  In  fact,  the  set 
of  s.d.d.  matrices  includes  all  symmetric  positive  definite  matrices  which  can  be  consistently 
ordered,  a  class  which  includes  all  symmetric  positive  definite  tridiagonal  matrices.  Dense 
matrices  may  be  s.d.d.  as  well. 

Another  example  arises  from  modeling  a  series  of  masses  mi,  .  .  .  ,m„  on  a  line  con- 
nected by  simple,  linear  springs  with  spring  constants  to *»  (the  ends  of  the  extreme 

springs  are  fixed).  The  natural  frequencies  of  vibration  of  this  system  are  the  square  roots 
of  the  eigenvalues  of  the  s.d.d.  definite  pencil  H-\M,  where  M  is  the  diagonal  mass  matrix 
diag(mi,  .  .  .  ,m„)  and  H  is  the  tridiagonal  stiffness  matrix  with  diagonal  ifco  +  'ti,  k^-^k^  ,..., 
t„_,+i,  and  off  diagonal  -*i  ,...,  -fc,-i.  Note  that  the  matrix  M""2ffM~"2^  ^hich  has  the 
same  eigenvalues  as  H  -\M,  is  symmetric  s.d.d. 

In  particular,  we  will  show  that  small  relative  perturbations  in  the  matrix  entries  only 
cause  small  relative  perturbations  in  the  eigenvalues  and  singular  values,  independent  of  their 


magnitudes.  This  is  a  much  tighter  perturbation  bound  than  the  classical  one  provided  by 
(1.1)  above.  Our  proof  of  this  result  generalizes  and  unifies  results  in  [6]  for  bidiagonal 
matrices  alone  and  in  [10]  for  symmetric  tridiagonai  s.d.d.  matrices  alone. 

Given  that  the  matrix  entries  determine  all  eigenvalues  or  singular  values  to  high  rela- 
tive accuracy,  one  would  naturjilly  like  to  compute  them  that  accurately  as  well.  We  present 
algorithms  based  on  bisection  which  attain  this  accuracy;  in  the  case  of  bidiagonal  or  sym- 
metric positive  definite  tridiagonai  matrices  QR  iteration  (suitably  modified)  can  be  shown  to 
attain  high  accuracy  as  well.  It  is  not  yet  known  whether  algorithms  based  on  divide  and  con- 
quer [4,  7,  9]  can  be  made  to  work  in  some  of  these  situations  too. 

One  may  also  ask  if  the  singular  vectors  and  eigenvectors  of  s.d.d.  matrices  and  pencils 
are  determined  any  more  accurately  than  for  general  matrices.  To  state  the  standard  pertur- 
bation bound  for  eigenvectors  of  symmetric  matrices  and  singular  vectors  of  general  matrices, 
we  need  to  define  the  gap:  if  X.,  is  an  eigenvalue  (singular  value)  of  A  then 
gap{\,)  =«  min  |Xi-X..|.    In  other  words,  it  is  the  absolute  distance  between  X,   and   the 

remainder  of  the  spectrum. 

(1.3)  Let  y  be  a  unit  eigenvector  of  A+hA,  a.  =  y^Ay  the  Rayleigh  quotient,  \,  the  eigen- 
value of  A  closest  to  a,  and  r,  its  unit  eigenvector.  Let  9(z,,)')  be  the  acute  angle 
between  y  and  z,.  Then  sin  e(>',2,)  s  4||8A  \\lgap{\i)  [15,  p.  222]. 

In  other  words,  the  error  as  measured  by  the  angle  is  proportional  to  the  reciprocal  of  the 
gap;  if  the  gap  is  small  (X,  is  in  a  cluster  of  eigenvalues),  the  corresponding  eigenvector  is 
poorly  determined.  As  before,  the  standard  algorithms  guarantee  ||8A  ||  s/(n)€(|A||,  so 
eigenvalues  poorly  separated  with  respect  to  \\A  ||  (i.e.  \\A  ||/^ap(X|)  is  large)  will  generally 
not  be  computed  accurately.   Analogous  results  hold  for  singular  vectors  of  general  matrices. 

For  eigenvectors  of  s.d.d.  matrices,  a  stronger  perturbation  theorem  is  true.  Briefly,  we 
can  replace  the  gap  in  (1.3)  with  the  relative  gap,  min  |X,-Xy |/|XjX,|"^.  Thus,  as  long  as  X, 

is  relatively  well  separated  from  its  neighbors,  its  corresponding  eigenvector  is  determined  to 
high  relative  accuracy.  This  is  a  much  stronger  result  than  (1.3),  as  the  following  example 
shows.  Suppose  the  eigenvalues  are  1,  2 -10"'°  and  10~'°.  Then  the  gap  for  the  smallest 
eigenvalue  is  ^ap  (10~'°)  =  10""'°,  but  the  relative  gap  is  .707.  Thus  (1.3)  predicts  a  loss  of  10 
decimal  digits,  whereas  the  finer  analysis  predicts  nearly  full  accuracy. 

We  also  show  that  a  suitable  variation  of  inverse  iteration  can  be  used  to  compute  the 
eigenvectors  to  this  accuracy.  We  conjecture  that  other  methods  based  on  divide  and  conquer 
can  attain  this  accuracy  as  well,  but  this  has  not  been  proven. 

Similar  results  can  be  proven  for  singular  vectors  of  bidiagonal  matrices  and  eigenvec- 
tors of  s.d.d.  definite  pencils;  the  result  for  singular  vectors  partially  settles  an  open  question 
in  [6]. 

The  rest  of  this  paper  is  organized  as  follows.  Section  2  contains  definitions.  Section  3 
discusses  some  simple  generalizations  of  Gershgorin's  theorem  applicable  to  s.d.d.  matrices. 
Section  4  uses  the  minimax  characterization  of  eigenvalues  to  present  simple  perturbation 
lemmas.  In  section  5  this  lemma  is  applied  to  singular  values  and  in  section  6  to  eigenvalues. 
Section  7  discusses  perturbation  theory  for  eigenvectors  and  singular  vectors.  Section  8  shows 
that  the  condition  numbers  for  the  eigenvectors  provide  good  estimates  for  the  reciprocal  of 
the  distance  to  the  nearest  matrix  with  multiple  eigenvalues.  Section  9  discusses  algorithms 
for  the  bidiagonal  singular  value  decomposition,  section  10  discusses  algorithms  for  the  sym- 
metric tridiagonai  eigenproblem,  and  section  11  discusses  algorithms  for  the  dense  symmetric 
eigenproblem  (both  matrices  and  pencils).  The  new  algorithm  for  the  symmetric  positive 
definite  tridiagonai  eigenproblem  will  be  included  in  the  LAPACK  linear  algebra  library  [12]. 
Section  12  summarizes  the  available  algorithms  and  the  current  state  of  reseaich,  and 
discusses  future  work. 


2.  Definitions  and  Basic  Lemmas 

In  this  paper  we  will  deal  exclusively  with  real  (usually  symmetric)  matrices.  Extensions 
to  complex  (usually  Hermitian)  matrices  will  be  obvious.  ||- 1|  wOl  denote  the  2-norm. 

Decompose  the  matrix  A  ^s  A=D  +N  where  D  is  diagonal  and  N  has  a  zero  diagonal. 
We  will  call  a  matrix  A  -^ -diagonally  dominant  with  respect  to  a  norm  |||-|||  if 
infill  ^  -y  min  |D„|,  where  0:s-y<l.  Suppose  that  A  is  -y-diagonally  dominant  with  respect  to 

either  the  1-norm  or  infinity-norm.  Then  the  well  known  Gershgorin's  Theorem  says  that  the 
eigenvalues  of  A  lie  in  the  union  of  the  Gershgorin  disks  fl,,  where  fl,  is  centered  at  D„  and 
has  radius  at  most  -y  |D„  |.  In  particular,  if  some  B,  is  disjoint  from  the  other  disks,  it  contJiins 
exactly  one  eigenvalue  and  D,,  is  an  approximation  to  this  eigenvalue  with  relative  error  at 
most  -y. 

Now  let  A  =D  +N  and  |D„  |=  1  i.e.  A  has  ±  I's  on  the  diagonal.  Let  Aj  and  Ai  be  arbi- 
trary nonsingular  diagonal  matrices.  Then  we  call  H  »»  A1AA2  "^-scaled  diagonally  dominant 
(y-s.d.d.)  with  respect  to  a  norm  |||-|||  if  A  is  -y-diagonally  dominant  with  respect  to  |||-|||. 
Note  that  a  matrix  may  only  be  diagonally  dominant  in  the  scaled  sense,  as  the  following 
example  shows: 

■"  1        10 
10    10000 


A  = 


.1 


Ai  =  A2  = 


0 
100 


H 


A1AA2  = 


Here,  A  is  -/-diagonally  dominant  with  7=  .1  (with  respect  to  the  1-norm,  2-norm  or  infinity- 
norm),  H  is  7-s.d.d.  with  the  same  7,  but  H  is  not  diagonally  dominant  in  the  nonscaled 
sense  for  any  7<1. 

Our  definition  of  scaling  implies  nothing  about  the  monotonicity  of  the  diagonal  entries 
of  H;  for  example  H  is  7-s.d.d.  if 


A  = 


1 

.1 

.1 

.1 

-1 

.1 

.1 

.1 

1 

A,  =  A,  = 


1     0 

0 

1 

10 

.001 

0    100 

0 

,    H  -  A,AA2  = 

10 

-10000 

.1 

0     0 

.01 

.001 

.1 

.0001 

If  /f  is  symmetric  with  positive  diagonal  entries,  being  7-s.d.d.  with  respect  to  the  2- 
norm  is  closely  related  to  another  well  known  property:  consistent  ordering  [16].  Consistent 
ordering  is  defined  as  follows:  Let  A  =I+N  =I  +  L  +  U,  where  L  is  strictly  lower  triangular 
and  U  is  strictly  upper  triangular.  Then  A  is  consistently  ordered  if  the  eigenvalues  of 
a.L  +  a.~^U  are  independent  of  a^tQ.  Now  suppose  there  is  a  permutation  matrix  P  such  that  A 
in  P^HP  =  ^.AA  =  A.(I  +N)A  is  consistently  ordered.  Then  we  claim  T  is  positive  definite  if 
and  only  if  it  is  s.d.d.  To  prove  this,  note  that  by  choosing  a  =  l  and  o=-l,  we  see  the 
eigenvalues  oi  N  =L  +  U  occur  in  ±  pairs,  including  ±  \\N\\  =  ±7.  Now  note  that  H  is  posi- 
tive definite  if  and  only  if  l+N  is  positive  definite  (by  Sylvester's  theorem),  and  that  the 
smallest  eigenvalue  of /-(-// is  l-||/^||=l-7.  Therefore,  the  theorems  in  this  paper  apply  to 
many  matrices  arising  from  discretized  elliptic  partial  differential  equations  [16]. 

We  wUl  call  a  symmetric  pencil  H -\M  -^-scaled  diagonal  dominant  definite  (y-s.d.d. 
definite)  with  respect  to  a  norm  \\\-\\\  if  H  and  M  are  7-s.d.d.  symmetric  with  respect  to  |||- 1|| 
and  M  is  positive  definite.  If  /f  is  positive  definite  as  well,  we  call  H -\M  y-s.d.d.  positive 
definite. 

If  r  is  any  symmetric  matrix,    ||r||  =  max  \\i(T)\  s   |||r|||  for  any  operator  norm  or 

( 

the  Frobenius  norm  |||-|||.  Therefore,  all  the  theorems  in  this  paper  which  are  proven  for 
diagonal  dominance  with  respect  to  the  2-norm  automatically  hold  for  diagonal  dominance 
with  respect  to  any  operator  norm  or  the  Frobenius  norm. 


The  minimax  characterization  of  the  eigenvalues  \i: 
H-\M  {H=H^,M=M^,M  positive  definite)  is  [15] 


£X„    of  a  definite  pencil 


x'^Hx 
X,  =  min  max   — z •  o  ti 

I     =  i 

where  S'  varies  over  all  /-dimensional  subspaces  of  R"  and  x  varies  over  all  unit  vectors  in  S' 
(x  could  vary  over  all  nonzero  vectors,  but  we  will  find  it  convenient  to  restrict  to  unit  vec- 
tors). There  is  an  obvious  simplifications  if  M  is  the  identity  matrix  (the  standard  eigenprob- 
lem). 


3.  Generalizations  of  Gershgorin's  Theorem 

It  turns  out  the  eigenvalues  of  the   s.d.d.   matrix  //  =  A]/4A;   lie   in   Gershgorin  circles 
whose  centers  and  radii  are  both  scaled  by  A1A2: 

Proposition  1:  Let  //  =  AiAA2  be  a  (possibly  nonsymmetric)  -^-s.d.d.  matrix  with  respect  to  the 
infinity-norm  or  1-norm,  with  -y  <  1 .  Then  the  eigenvalues  of  H  lie  in  disks  B,,  where  B,  is  cen- 
tered at  //„  and  has  radius  at  most  -yl//,,  |.  //  B,  is  disjoint  from  the  other  disks,  it  contains 
exactly  one  eigenvalue  and  //,,  is  an  approximation  to  that  eigenvalue  with  relative  error  at  most 
"Y  • 

Proof:  Suppose  without  loss  of  generality  that  H  is  -y-s.d.d.  with  respect  to  the  infinity-norm; 
otherwise  consider  H^.  The  scalar  X.  is  an  eigenvalue  of  H  if  and  only  if  // -  X/  is  singular, 
which  is  in  turn  true  if  and  only  if  A-XAf 'A;  '  is  singular.  Let  x  be  a  right  null  vector  of 
iA-XAf'Af',  and  suppose  Xj  has  absolute  value  at  least  as  large  as  any  other  component  of 
X.   Then  we  may  rearrange  the  equation 

2  Aji^xi,  -  \xJ^ljJ^2.h  =  0 

k 


to  obtain 


^ij.Az.jj  i^jj  +  S  '^.ik  — ) 


**;  ^J 


"  Xk    , 

Since   Ajj^uj^i.jj  ~  ^ji    ^nd    I-7  s    \Ajj  +   ^  ^ik  — I  —  l"'")'.    ^    must   be    in    the    range 

Hu{\--i)  to  //„(l  +  -y).  The  usual  Gershgorin  argument  shows  that  if  this  disk  is  isolated,  it 
contains  exactly  one  eigenvalue.  D 

This  theorem  implies  that  at  least  if  a  Gershgorin  disk  is  isolated  so  that  small  relative 
changes  in  the  matrix  entries  do  not  effect  its  isolation,  then  the  eigenvalue  it  contains  cannot 
change  by  a  factor  of  more  than  (l-i--y)/(l--y).  If  we  assume  H  is  symmetric,  we  need  not 
assume  the  disks  are  isolated  to  obtain  this  result: 

Proposition  2:  Let  H  be  a  -^-s.d.d.  symmetric  matrix  with  respect  to  the  2-norm.  Let  h,  be  its 
diagonal  entries  in  increasing  order  /ijS  •  •  •  ^/i„,  and  X,  its  eigenvalues,  also  in  increasing 
order.  Then 

l--i  S  — !-  <  1-I--V    . 

Proof:  Assume  without  loss  of  generality  that  Hii  =  h,,  by  reordering  the  rows  and  columns  of 
H  if  necessary.  Then  by  (2.1) 

X,  =  min  max  x^Hx  s    max  x^Hx  =    max  x  H^'^x 

lkli  =  l  ||x||  =  l 

where  Sq  is  the  space  spanned  by  the  first  1  standard  basis  vectors,  and  //''^  is  the  leading 
principal  i  by  /  submatrix  of  H.    If  /i,<0  then  X|<(l-"/)/i,  by  simple  norm  inequalities.  If 


h,>0  and  all  h,>0  for  ;£(,  simple  norm  inequalities  again  imply  X,<  (1* -y)/!,.  If  h,>0  and 
some  h,<0  for  j  <  i,  we  also  have  X  <  (]  ^  y)h:  but  we  must  argue  as  follows: 


■t^/f'"i  =  [x[  ,  'xl] 


A,     0 
0     A. 


-/,    0 
0      /, 


+  A' 


r[r, 

A, 

0  ■ 

•^1 

0 

A; 

X2 

<   -||A,;,||=  +    |1A:;:|P  +  7(I|A,;?,||=  +    llA.;.!!^) 


£    (l^y)\\A2X2 


(1  +  7)'', 


Applying  the  same  process  to  -H  yields  the  complementary  inequalities  \,s(l  +  7)/i,  for 
/i,<0  and  X,s(l-7)/i,  for /i,>0.    D 

Finally,  we  may  extend  the  result  to  s.d.d.  symmetric  definite  pencils: 

Proposition  3:  Let  H  —  kM  be  a  symmetric  y-s.d.d.  definite  pencil  with  respect  to  the  2-norm. 
Let  r,  be  the  sorted  ratios  of  diagonal  entries  H,JM„,  where  rjS  •  •  •  Sr„,  and  X,  the  eigen- 
values, also  in  increasing  order.  Then 


1  +  7 


Proof:  Assume  as  in  the  proof  of  Proposition  2  that  r,  =  H„IM„,  by  reordering  rows  and 
columns  if  necessary.  Write  M  =AAA  where  A  is  diagonal  and  A  is  diagonally  dominant  with 
ones  on  its  diagonal.  Then  the  pencil  A~'//A"'-XA  ^  R-\A  has  the  same  eigenvalues  as 
H -\M,  but  now  the  diagonal  entries  R,,  =  r,.  Then 


X  "^  R  X 

X,  =  min   max    -^ —  £    max 


x''Ax 


i«.Sfa 


x'^Rx 
x'^Ax 


=    max 


X  R 


(•)■ 


i^A('>i 


where  Sq  is  the  space  spanned  by  the  first  i  standard  basis  vectors,  and  R'-''  and  A''^  are  the 
leading  orincipal  i  by  i  submatrices  of  R  and  A,  respectively.  Note  that 
1-y  £  X  A"'x  £  l  +  y  for  all  unit  vectors  x,  since  A  equals  the  identity  matrix  plus  a  matrix 
of  norm  at  most  y.  Then  by  Proposition  2,  we  have  X,  s  (1+ 7)/(l- "y)r,  if  r,>0  and 
>^i  =s  (l--y)/(l  +  "y)'',  if  '•,<0.  Applying  the  same  process  to  R  +  kA  yields  the  other  inequali- 
ties. D 


4.  Perturbation  Lemmas  Based  on  the  Minimax  Theorem 

Let  Xi(//,A/)s  ■  ■  ■  :£\,XH,M)   denote  the   eigenvalues   of  the   definite   pencil  H-\^f. 
Given  the  minimax  characterization  in  (2.1),  the  following  lemma  is  simple  to  prove; 

Lemma  1:  Suppose  hH  has  the  property  that  for  all  nonzero  x 


where  0<giSgu-  Then 


81  ^ 


g! 


x'^Hx 


^    gu 


\Xff  +  hH,M) 


X,(«.A/) 

for  alt  i.  In  other  words,  if  the  Rayleigh  quotients  x^{H  +  bH)x  and  x^Hx  differ  by  at  most  a  cer- 
tain factor  for  all  x,  then  the  eigenvalues  of  H  +bH  -\M  and  H  -  \M  differ  by  at  most  that  same 
factor. 

Proof:  Let  X,  =  X,(//,A/)  and  Let  X',  ^  X,(W  +  8//,A/).  We  consider  only  X,>0;  the  case 
X,<0  is  analogous.    Let  the  spaces  Sq  and  S\  satisfy 

\,=  ra&x  x'^HxIx'^Mx     and     \'  =Tn&x  x'^ {H 'rhH)xlx'^ Mx    . 

x€Sb  xfSi 


Then 


and  similarly 


min  max 

S'       xi%' 


x'^jH  +  hH)x 


x'^Mx 


x'^(H  ^hH)x    x'^Hx    ^       . 
<  max  — ^ — -z '—  -7 -  gui^, 

,6Sb  X  Hx  ^^^-^ 


x'^Hx 


X,  =  min  max  —r- 

S'        rfS'      x'h 


max 


x'^Hx  x^iH  +  bH)x 


x'^Mx 


gr'y^'i 


s'     xis-    x^Mx         xis.\   x'{H^hH)x 

completing  the  proof,  n 

There  is  an  obvious  analogous  result  if  if  both  H  and  M  are  perturbed  simultaneously: 
Lemma  2:  Suppose  5//  and  hM  have  the  property  that  for  all  nonzero  x 

— ^T: -  Sum 


giH  =s  — ^^TTT -  SuH     and     g,.„ 

X  Hx 


where  0<giH'^guH  ^"^  0<giM-guM-  Then 

giH  X,(//  +  6//,M  +  5Af) 


x'Mx 

guH 
glM 


gM  X,(W,Af) 

for  all  i. 

Lemma  3:  Let  H  be  symmetric  "^-s-d-d.  with  respect  to  the  2-norm.  Write  H=  AAA  where  A  is 
diagonal  and  A  has  ±l's  on  its  diagonal.  Let  X  be  an  eigenvalue  of  H,  y  the  corresponding  unit 
eigenvector,  andx  =  Ay/\\Ay\\  the  corresponding  unit  eigenvector  of  A  — XA~'.  Then 

l--y  <    \x'''Ax  I  <   1  +  -Y    . 

Proof:  Write  A  =E  +N,  where  E  is  diagonal  with  ±  I's  on  the  diagonal,  A^  has  zero  diagonal, 
and  ||A^||s"y<l.  The  upper  bound  on  |jr'^i4jr|  follows  immediately  from  taking  norms: 
|x^Ax  I  £  ||£||+  llA^II  s  1  +  7.  If  £=/,  then  \x^Ax  \  =  |1  + x^'A'ar  |  S;  l-^y.  Assume  then 
without  loss  of  generality  that 


E  = 


I,      0 


0    -/, 
where  /;  denotes  an  /-by-/  identity  matrix.  We  will  prove  the  theorem  only  for  X<0;  for  the 


positive  X  consider  -H.    Partition 

<  I 


,     A  = 


A,     0 
0     A. 


and     A' 


■Vi 


conformally  with  E.  Then  A.r=XA"-x  may  be  rewritten 

X\  +  A'fx  =  XA  f 'X| 

Solving  the  first  equation  above  for  xj  yields 

xi  =   (XAf'  -  /)"'A'[z    . 
Note  that  XAf-  -  /  is  diagonal  with  diagonal  entries  less  than  -1.    Now 

x'^Ax  =  x'^Ex  +  x'^Nx  =  x\x\   -  xlx^  +  x^A'x  =  2x[xi  -   1  +  x^A'x 
since  xfx]  +  x;X;  =   1.    Combining  the  last  two  displayed  equations,  and  using  the  fact  that 

-x'"A',(XAr'  -  /)~'A'[x  2:  x'"A',(XAr'   -  /)"-A'[x  >  0 
yields 


x'^Ax  =   2x''A',(XAr-  -  /)"'A'[x  +  (x[  ,  x[) 


A'jx 
six 


=  2x^A'i(XAi"-  -  /)"-A'[x  +  xTxlx  +  x^A'i(XAf^  -  /)""'A'[x  -   1 


£  xJA'Jx  +  x'"A'|(XAr-  -  /)"-A'[x  -   1 


=  xlS'lx  +  x[(XAr-  -  /)-'A'[x  -   1 


=   (x[(XAr=  -  /)-'  ,x[)A'x  -   1 


(XAf^  -  /)-'x, 

*2 

^1 
^2 

\\-\\^^\\-    1 

A'x      -   1 


■Y  -   1 


In  the  next  two  sections  we  will  use  these  results  to  derive  perturbation  theorems  for 
eigenvalues  and  singular  values. 


5.  A  Perturbation  Theorem  for  Singular  Values. 

Using  Lemma  2,  we  prove  the  following  theorem,  which  is  a  mild  strengthing  of  a 
result  of  Kahan  [6]: 

Theorem  1:  Let  B  be  an  n  by  n  bidiagonal  matrix: 

fli    bi 

B  = 

b„-. 


We  assume  the  a,  and  ft,  are  nonzero  since  otherwise  B  splits  into  independent  subproblems.  Let 
B  +  hB  be  a  perturbed  bidiagonal  matrix  with  entries  a,ai  in  place  of  aj  and  Pjft,  in  place  of  bj. 
Then  the  singular  values  ajS  •  •  •  Scr„  o/fl  and  cr'i^  •  •  •  Sa'„  satisfy 

a', 


where  gi  and  gu  are  defined  as  follows.  Define  the  finite  set  S  of  positive  numbers  by 

3;    -    •    •    Pt       ,  . ,^  H       .     ,       fl       "i    ■    ■    ■   *** 


{|- 


:ls;£i£n-l}    U    {| 


3t- 


1    :  Isjsiksn} 


a;  +  i  •  •  •  at  •  3;  •  •  •  pt_i 

Note    that  S   contains    \^j\,    lS;<n,    and    \aj\,    ISjSn.     Let  min  S   and  max  5   denote    the 
minimum  and  maximum  entries  of  S,  respectively.  Then 


gu  =  max  S     and     g/  =  min  S    . 

Corollary  1:  Let  B  and  B+bB  be  bidiagonal  with  singular  values  o-i(S)s  • 
CTi(S +8fl)s  •■  •  So-„(fl  +  8S)        respectively.         If      for        all        nonzero 


|(fl  +  6fl)/y/fl,;|  S  J  for  some  7^1,  then 


■  S(T„(B)  and 
entries        Bij, 


g,(B+8g) 
<x,(fl) 


,2n-l 


Thus,  relative  perturbations  of  at  most  r  in  the  entries  of  B  cause  relative  perturbations  of  at 
most  T^"'^  in  its  singular  values.    //T=l+e  is  close  to  1 ,  so  is  t^''~'=1+(2«  -1)c. 

Corollary  2:  Let  B  and  B+hB  be  bidiagonal.  If  |(S  +  Sfl)y|  =  r\B,j\  for  all  i  and  j,  then 
<Ti(B  +5fl)  =  Ta,(S).  This  simply  says  that  if  you  multiply  a  matrix  by  a  constant  t,  you  multi- 
ply all  its  singular  values  by  t  as  well. 

Proof  of  Theorem  1:  For  notational  convenience  rename  the  entries  of  B  so  that 


B  = 


and  so  that  B  +  hB  has  entries  7,j,  in  place  of  j,.  We  may  assume  without  loss  of  generality 
that  all  the  j,  are  real  and  positive,  since  this  may  be  achieved  by  pre-  and  postmultiplying  B 
by  unitary  diagonal  matrices.  We  may  also  assume  the  -y,  are  real  and  positive  for  the  same 
reason.   We  also  use  the  well  known  fact  that  the  eigenvalues  of 


•^2 

^3     J4 

■     ^2n-2 

J2-.-1 

c  = 


0  b'" 

B     0 


are  plus  and  minus  the  singular  values  of  B.   Furthermore,   by  reordering  the  rows   and 


columns  of  C  in  the  order  1,  n  + 1,  2,  «  +2,  .  .  .  ,  n.  In,  we  see  C  is  similar  to 

0      5, 

^ln-\  0 


E  = 


Thus,  the  singular  values  of  B  are  the  absolute  values  of  the  eigenvalues  of  the  pencil 
E  —  \I.  We  are  interested  in  comparing  these  eigenvalues  with  the  eigenvalues  of  E  +  bE  —  \I, 
where  E  +  bE  has  entries  "y,j,  in  place  of  j,.  We  will  chose  u  diagonal  matrix  D  such  that 
D  (E  +&E)D  =  E,  so  that  these  eigenvalues  will  be  the  eigenvalues  of  the  pencil  E  —  \D^\  we 
will  then  apply  Lemma  2  to  conclude  that  the  eigenvalues  will  change  at  most  by  a  factor  in 

the  range  miii£>,7^  to  maxD,7^. 

I  1 

Actually,  we  will  find  two  diagonal  matrices  O^  and  D;  satisfying  D(E  +&E)D  =  E  but 
corresponding   to    different  choices   of  D^    in   order   to   minimize   max  D;„   and  maximize 

min  Du„;  this  will  in  turn  maximize  gt  and  minimize  g„  where  (by  Lemma  2) 


gi 


1 


1 


max  DJii 


min  DI,, 


gu 


First  consider  D,.  We  want  to  choose  Dm  to  minimize  max  D,„.  As  above,  the  diagonal 

entries  of  D/  must  satisfy  the  following  recurrence:  D;,  +  i  ,  +  i  =  (7,D(„)~'.  This  means  the 
diagonal  entries  of  D;  are  either  of  the  form 


D,n- 


7173 


yv-\ 


or 


7274 


7274 


72; 


72; 


'ni7i     7375 


72;  + I 


(5.1) 


(5.2) 


Choosing  0,^  to  minimize  the  maximum  of  these  terms  is  equivalent  to  choosing  Dm  to 
minimize  max(j,Dni  ,  ^2^/11),  where  i,  is  the  maximum  coefficient  of  Dm  in  (5.1)  and  ^2  is 
the  maximum  coefficient  of  Df,j  in  (5.2);  this  minimum  is  easily  seen  to  be  (jiJ2)"^-  Some 
tedious  algebraic  manipulation  leads  to  the  expression  for  gi  in  the  statement  of  the  theorem. 
Choosing  D^^  to  maximize  min  D^,,  is  analogous.  □ 

A  similar  proof  yields  the  following  analogous  result:  Let  A  be  an  arbitrary  matrix,  and 
Di  and  D2  arbitrary  nonsingular  diagonal  matrices,  which  we  may  assume  are  real  and  non- 
negative.  Then  the  singular  values  ct,  of  A  and  a' ,  of  D^ADi  satisfy 


_                            cr  , 
mm  Di^-mm  D2J  ^  s  max  Di,max  D 
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6.  Perturbation  Theorems  for  Eigenvalues 

In  this  section  we  use  the  perturbation  lemma  of  section  3  to  prove  perturbation 
theorems  for  eigenvalues  of  symmetric  ^-s.d.d.  matrices  and  -y-s.d.d.  positive  definite  pen- 
cils. Theorems  2  and  3  apply  only  to  positive  definite  matrices  and  pencils,  and  are  stronger 
than  Proposition  4  and  Theorem  4  which  apply  to  indefinite  matrices  as  well.  In  fact,  when 
the  matrix  or  pencil  is  positive  definite,  A  may  be  an  arbitrary  matrix,  not  just  diagonal.  Our 
results  for  definite  pencils  with  both  positive  and  negative  eigenvalues  are  rather  weaker  than 
the  results  for  s.d.d.  indefinite  symmetric  matrices. 

Theorem  2:  Let  H  =  ^A^  =  ^{l  +N)^  be  a  positive  definite  matrix  where  ||A'1]  =  'y<1,  and  A  is 
an  arbitrary  nonsingular  matrix.  (If  ti  is  diagonal  this  is  equivalent  to  H  being  a  symmetric  posi- 
tive definite  -i-s.d.d.  matrix  with  respect  to  the  2-norm.)  Let  hH  be  a  symmetric  perturbation  of 
H  with  ||A"'5//A"' II  =  e  <  I-7.  Then  the  eigenvalues  \i-^  ■  ■  ■  <X„  of  H  and 
X',<  •  •  •  s\'„  ofH  +  bH  satisfy 

1  -   -^—  s   -^  <  1 


1-^         X,  1-^ 

Thus,  if  ^  is  diagonal,  relative  perturbations  of  at  most  €  in  the  entries  of  H  cause  only  relative 
perturbations  of  at  most  ne/(l--y)  in  the  eigenvalues. 

Proof:    We  have 

x^{H^bH)r    ^    x^A(A^A-'5//A-')Ax    ^    y^M  ^  A -' 5//A  "')> 
x^Hx  x'^^A^x  y^Ay 

where  y  =  Ax.  The  values  taken  by  the  quantity  in  (6.1)  as  x  varies  over  all  nonzero  vectors 
are  the  same  as  the  values  taken  on  as  >■  varies  over  all  unit  vectors.  Since  1  -ys.y^Ay  if  y  is  a 
unit  vector, 

^__JL_<    y^(A-fA-'S//A-')y   ^  ^^_^    _ 
1--Y  y'^Ay  I-7 

The  result  now  follows  from  Lemma  1.  o 

Theorem  3:  Let  //  =  A«A//A//=  A//(/ -i-A'w)A//  and  M  =  A_\iAsi^.\!=  £^^{1  +  Nm)^.m  be  positive 
definite  matrices  where  ||A'// ||s-y <  1,  H/V^^  ||<-y<  1,  and  A^  and  Am  are  arbitrary  nonsingular 
matrices.  (If  Ah  and  Am  ore  diagonal  this  is  equivalent  to  H—kM  being  a  y-s.d.d.  positive 
definite  pencil  with  respect  to  the  2-norm.)  Let  hH  and  SA/  be  symmetric  perturbations  of  H  and 
M  such  that  IJA^'SHAw'  ||  £  t  <  l--y  and  ||A^'6A/A,r/'  ||  s  e  <  I-7.  Then  the  eigenvalues 
X.,S  ■  •  •  <\„  ofH-XM  an^  X'i<  •  •  •  <X'„  0/ (W  +  8//)- X(A/ +  8A/)  satisfy 

1-7-6   ^  2^  ^    l-y  +  € 
l-'Y  +  e    ~    X',    "    1-7-e 

Thus,  if  Af]  and  A<,f  are  diagonal,  relative  perturbations  of  at  most  c  in  the  entries  of  H  and  M 
cause  only  relative  perturbations  of  at  most  n€/(l—  7)  in  the  eigenvalues  of  H  —  \M . 

Proof:  Apply  the  technique  in  the  proof  of  Theorem  2  to  both  H  and  M  and  apply  Lemma  2. 
D 
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When  H  is  indefinite,  our  results  are  weaker.  In  the  case  of  H  alone,  we  must  assume  A 
is  diagonal  to  attain  a  bound  like  that  of  Theorem  2: 

Proposition  4:  Let  H  be  an  n-by-n  symmetric  y-s.d.d.  matrix  with  respect  to  the  2-norm.  Thus 
H  =  Ay4  A  where  A  is  diagonal,  A  =  E  ^N  where  E  is  diagonal  with  :r  I' s  on  the  diagonal,  N  has 
a  zero  diagonal  and  ||A'||<'y<1.  Let  bH  be  a  symmetric  perturbation  of  H  with 
||A~'8WA~' II  =  £  <  (1— >)/«.  Assume  also  that  H  +  hH  is  y-s.d.d.  Then  the  eigenvalues 
X,£  ■  •  •  <\„  ofH  and  X.',s  •  ■  ■  SX'„  ofH  +  hH  satisfy 

1-^  X,  1--/ 

Thus,  relative  perturbations  of  at  most  e  in  the  entries  of  H  can  cause  relative  perturbations  of  at 
most  n"e/(l  — -y)  in  the  eigenvalues. 

Proof:  We  will  prove  the  theorem  only  for  the  negative  eigenvalues;  for  the  positive  ones 
consider  —H.  We  cannot  apply  Lerrlma  1  directly  here  because  x^AxIx^x  is  not  bounded 
away  from  0  for  all  x  as  in  the  proof  of  Proposition  4.  By  Lemma  3,  however,  it  is  so 
bounded  if  we  restrict  x  to  lie  in  an  eigenspace  of  /4  — XA~'  corresponding  to  only  negative 
(or  only  positive)  eigenvalues;  this  will  be  sufficient  for  the  proof. 

To  this  end,  let  Xj^  •  •  •  £X„  be  the  eigenvalues  and  xj,  .  .  .  ,x„  the  corresponding  unit 
right  eigenvectors  of  A  -XA"-;  let  the  primed  quantities  X'jS  •  •  •  £X'„  and  jr',  correspond 
to  >1  +A"'8HA"'-XA"'.    By  Lemma  3  x]Ax,-^-i-l<Q  if  X,<0. 

Now  let  X,  =  [x,,  .  .  .  ,x]  and  A,  =  diag{\^,  .  .  .  ,X,),  so  AX,  =  A"-X,A,.  Since  the 
columns  of  A",  are  eigenvectors  of  /I  -  X  A  "^ ,  the  columns  of  A  ~  'X,  are  eigenvectors  of  H  and 
so  are  orthogonal.  Thus 

X]aX,  =  XfA--X,A,  =  dlag{xJ^-"x,\,) 

is  diagonal  with  diagonal  entries  bounded  above  by  -/-L  Let  z  be  an  arbitrary  unit  vector; 
then 

2'^X]aX,z  =  z^diagix]^-'x,\,)z  <  7-1    . 
Now  we  use  the  characterization 


min  max 


x'^Ax 


S'     x(s'    x^A    ■'^x 
where  the  minimum  is  attained  for  S'  =  Sq  =  span{X,).  Then 

,,  .  j:^(A +A"'S//A"')x    ,  x](A+^-^hH^-^)x       x'^  Ax 

X',  =  mm  max  — ^ ^f—^-, ^  max  — ^ = ^ _, 

s'     x(s'  X  A    'X  xjsb  x' Ax  x' A    ■ 


X 


-  ^,v  n  ^  ^'^rA-'5//A-'x,2^     2'x]ax,z 
=    max  (1  +  = — = )•— T — :;: 

li^li  =  i  z^x]aX.z  '   z^XjA-'-X,z 

Now  \z^xjA'^hHA-^X,z\  <  it  and  \z'^xJaX,z\  ^  I-7  so 

X',^(1--^)X,      or      h^^i-Jl- 
Swapping  the  roles  of  A  and  A  +  A"'8//A~'  we  obtain 

1-7  X,  ^  1-7 

as  desired.  □ 
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The  factor  n  in  the  bound  of  Proposition  4  is  an  overestimate,  and  can  be  removed  by 
modifying  the  conditions  of  the  proposition  just  slightly: 

Theorem  4:  Let  H  =  AAA  be  an  n-by-n  symmetric  y-s.d.d.  matrix  uif/i  respect  to  the  2-norm. 
Here  A  is  diagonal  and  A  has  ±Vs  on  the  diagonal.  Let  hH  be  a  symmetric  perturbation  with 
||A"'6WA~' II  -  «•  Assume  that  H  +  ihH  is  y-s.d.d.  for  all  0<4<1.  Then  letting 
X]<  •  •  •  <\„  be  the  eigenvalues  of  H  and  X. ',<••£  X '„   fee  f^f  eigenvalues  ofH  +  hH,  we 


have 


Proof:  Let  E  =  A~'SWA~ '/ ||A"'BHA" '  ||  be  a  matrix  of  norm  1,  H  U)  =  A(A  +££:)A,  and 
X;(4)s  •  •  •  £X,(0  be  the  eigenvalues  of  //(O-  Suppose  first  that  X,(0)  is  simple.  Let  x,  be 
the  unit  eigenvector  corresponding  to  X,(0).  Then  from  standard  eigenvalue  perturbation 
theorem  [11,  16],  we  know 


(6.3) 


Therefore 

Mil  =  1 .  ^4^^  .  oa^)  ^  1 .  c4^  -  oic^) 

X,(0)  ^xfAAAx,  yjAy, 

where  Ijy,  ||  may  be  taken  to  be  one.  By  Lemma  3,  |vM>'/ 1  -  l~'y.  and  so 

Assume  now  that  \Xi)  is  simple  for  all  0<C<€;  then  (6.3)  implies  that 
\d  logX,(;t)/<fx|  :s  (1--y)"'.  Integrating  from  0  to  €  yields  (6.2).  By  [11,  Theorem  II. 6.1],  the 
eigenvalues  are  all  real  analytic,  even  when  they  are  multiple.  Thus,  if  there  are  only  finitely 
many  4  where  X,(0  is  multiple,  we  can  apply  the  above  argument  in  the  intermediate  inter- 
vals. 

It  remains  to  consider  the  case  where  X,(0  is  multiple  for  infinitely  many  C-  Here  we 
argue  that  this  can  only  happen  for  a  set  of  pairs  of  matrices  H  and  £  of  measure  zero,  that 
off  this  set  the  previous  argument  holds,  and  by  continuity  of  the  eigenvalues  the  same 
bounds  must  hold  on  the  set.  To  see  that  the  set  of  H  and  E  such  that  some  X,(C)  is  multiple 
infinitely  often  is  of  measure  zero,  consider  the  discriminant  of  the  characteristic  polynomial 
of  H  -^AEA;  this  is  a  polynomial  in  4  and  the  2n'  entries  of  H  and  E.  H  (C)  can  have  multi- 
ple eigenvalues  if  and  only  if  this  discriminant  vanishes.  If  it  vanishes  for  infinitely  many  J, 
its  coefficients  (viewing  it  as  a  polynomial  in  ^)  must  vanish  identically.  These  coefficient  are 
in  turn  polynomials  in  the  entries  of  H  and  E,  and  so  vanish  only  on  a  proper  variety  (a  set  of 
measure  zero).  Off  this  set,  the  discriminant  has  at  most  a  finite  number  of  zeros  (bounded 
by  its  degree  as  a  polynomial  in  £)■    ^ 

Result  (6.2)  was  claimed  without  proof  in  [10]  just  for  the  case  of  tridiagonal  matrices 
perturbed  on  their  offdiagonals. 

In  light  of  Proposition  3,  it  is  probably  possible  to  prove  an  analogous  theorem  for  7- 
s.d.d.  definite  pencils  which  have  both  positive  and  negative  eigenvalues,  but  we  have  not 
been  able  to  do  so.  However,  the  following  technique  frequently  succeeds  in  reducing  the 
eigenproblem  for  such  pencils  to  a  problem  where  Theorem  4  may  be  applied: 
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Algorithm  1:  Reducing  a  y-s.d.d.  definite  pencil  H  -  \M    to  an  s.d.d.  matrix  Y: 

(1)  Let  D|  =  diag(A/,', -),  and  compute  H^  =  D'^HD'^  and  M  ^  =  D'^MD  "'.  Now  A/,  has  unit 
diagonal  and  is  diagonally  dominant  in  the  usual  sense. 

(2)  Let  P  be  a  permutation  matrix  chosen  so  that  Wi  =  PH  iP^  has  its  diagonally  entries 
sorted  from  smallest  to  largest  in  absolute  value  (smallest  at  the  top  left,  largest  at  the 
bottom  right).  Let  Mi^PM  ^P'^ . 

(3)  Let  L  be  the  lower  triangular  Cholesky  factor  of  A/;.  Let  y  =  L"'//2^~^-  Then  Y  and 
H —\M  have  the  same  eigenvalues. 

This  is  a  variation  on  the  usual  reduction  of  a  definite  pencil  to  standard  form.  The 
point  is  that  if  M  is  sufficiently  diagonally  dominant,  L  will  also  be  diagonally  dominant  with 
nearly  unit  diagonal,  and  the  multiplication  L'^H^L'^  will  not  destroy  the  s.d.d.  property  of 
H^-  Thus,  Y  will  be  s.d.d.  The  following  theorem  formalizes  this,  but  is  weak  in  that  it  only 
guarantees  diagonal  dominance  of  Y  for  rather  small  -y,  much  smaller  than  those  that  work  in 
practice: 

Proposition  5:  Let  H  —  \M  be  an  n  by  n  y-s.d.d.  definite  pencil,  and  let  Y  be  the  output  of  the 
above  reduction  algorithm.  Define 


Then  if 


y'  -  ((2n)'=  +  l)|^^-[2  +   ((2;i)'=+l)V=]-7' 
l--y 


-^    (n  +  l)y'  +  y    ^ 
1-7' 


which  will  be  true  for  y  small  enough,  Y  will  be  y-s.d.d. 

The  proof  of  Proposition  5  requires  the  following  lemma: 

Lemma  4:  Let  M  be  an  n  by  n  y-s.d.d.  matrix  with  unit  diagonal.  Let  L  be  its  lower  triangular 
Cholesky  factor.   Then    \\L  -  I  \\  <  ((2r,)' - -^  l)-y '  -  and    ||Z.-'-/|j  <  ((2n)"- +  Ij^l'^/Cl- •y)"^ 
Also.  (l+>)-'-  <    ||Z.-'||  s  (1-7)"". 
Proof:  Let  X=L-/  and  U'  =  L"'-/  =   -Z.~'X.  Since 

(l--y)''2  <  VJniM)  =  a^,„(L)  <  L„  <  (7^„(Z.)  =   Xi,:L(M)  <  (l  +  yV' 

we  have  \X„\  =  \L„-l\  <  l-(l-7)"2  <  ^"2.  Also,  we  may  bound  the  norm  of  the  i-th 
subdiagonal  column  of  X  as  follows: 

1  +  7  ^   \\[L,,,  ,  L..,.,  ,  •  •  •  ,  L„,]\\'  =    l|[Z.,,,  ,  X,^,,  ,  •  •  ■  ,  X„,,]|p  ^  1-7  +    ||X,  +  ,^,.,|p 

whence  ||X,.,  „,,  ||  s  (27)"2.  Thus  ||X  ||  s  (2n7)"2 +  7"=  as  desired.  Finally, 
||W||^    ||L-M|-||X||^  (l-7)-"^||X||.D 

Proof  of  Proposition  5:  By  applying  the  first  two  steps  of  the  reduction,  assume  without  loss 
of  generality  that  M  has  unit  diagonal  and  that  \H„\  s  |W,  +  i,.,|  for  all  i.  Let  L.  ,  denote 
the  i-th  column  of  L  and  similarly  for  Z.^  .  Also,  let  G"-'^  denote  the  leading  i  by  j  siibmatrix 
ofC.    Let  D  =  diag(|//„  I'-)  and  !-'  =  /  + H'.  Then 

Y.J  =  L-'HL-J  =  H.j  +  W,.HL-J  +  L-^HW  j  +  W^^.HW  j    . 
Therefore 

\Y,j-H,j\  <    \W,^HL~J\  +    \L-^HW  j\  +    \W,,.HW  J 

^  (2||W|H|L-M|  +    ||Wl|2)||//('..'>||  <  (2||W|H|L-'||  +    ||Wip)D„D,,(l+7) 
Now  insert  the  bounds  of  Lemma  4  to  get 
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Finally,  letting  D)  =  diag(  iK,,  |' -) 

||offdiag(Dr'yD>^')||<    ^"\^}]["^ 

follows  from  simple  norm  inequalities.  D 

In  order  to  apply  Proposition  4  or  Theorem  4,  however,  we  must  argue  that  small  rela- 
tive perturbations  in  H  and  M  (of  the  type  permitted  in  Proposition  4  and  Theorem  4)  cause 
small  perturbations  of  the  same  type  in  Y: 

Theorem  5:  Let  H-\M  =  DfjAHDff-kD^,A>.iDM  be  a  y-s.d.d.  definite  pencil,  where  A^  and 
Am  have  ±\s  on  their  diagonals.  Let  Y  =  D)-4)D)'  be  the  output  of  Algorithm  1  applied  to 
H  -KM.  Assume  that  Y  is  y-s.d.d.  (y  may  be  smaller  than  the  expression  in  Proposition  5).  S'ow 
define  H  (t)  =  Dh{Ah+ f-EH)D u.  and  similarly  M  (e)  =  D m{A:,,  + (.E>,,)D s,,  where 
\\Eh\\  =    \\Ey.j\\  =   1.  Let  Y{t)  be  the  output  of  the  reduction  algorithm  applied  to  H{e).  Then 

n(2-n'---f  1)(1+-y) 

and  the  eigenvalues  X,(e)  o/// (c)  -  XA/ («)  satisfy 

_       .(2.n'^^l)(l-,)    ^  ^     :    ^    Mil  ^  1  ^  ,.iLl2Vijain^yi^0(c^-)    . 

For  the  proof  of  Theorem  5  we  require  the  following  lemma: 

Lemma  5:  Let  M  be  a  y-s.d.d.  symmetric  matrix  with  unit  diagonal.  Let  L  be  its  lower  triangu- 
lar Cholesky  factor.  Let  L  +  5L  be  the  lower  triangular  Cholesky  factor  of  the  perturbed  matrix 
M  +  bM.  Then 

,1/2 


\\DyHY{e)-Y)D] 


Oie-) 


bL\\  < 


1-7 


|8A/||  +   OdlSA/ll^)    . 


Proof:  It  suffices  to  consider  M  diagonal,  since  the  Cholesky  factors  of  Af  and  QMQ^  (Q 
orthogonal)  have  the  same  norm.  Equating  first  order  terms  on  both  sides  of 
Af  +  6A/  =  {L  +  bL)iL+hLy  yields  hL,j  =  M'^'^bM^j  ii>j)  and  BL„  =  .5-Af,7'^5A^„  ((=;). 
Taking  norms  yields  the  result.  D 

Proof  of  Theorem  5:  By  applying  the  first  two  steps  of  Algorithm  1,  we  may  assume  without 
loss  of  generality  that  A/„=l  and  |//„  |  s  |W,»,,^,|.  Let  L{(.)  be  the  Cholesky  factor  of 
M{f.).  Then  the  eigenvalues  of  H {e)-\M  (t)  are  the  eigenvalues  of 
y(€)  =  L"'(e)//(e)Z."^(e).  Letting  A/(e)  =  A/ +  8A/,  L(€)  =  L  +  bL,  and  //(e)  =  W  +  8//,  we  get 
that  to  first  order 


+  L~^bHL~'^  -  L~^HL~^bL'^L~^ 


y(£)  =  (Z.-'-L"'8Z.L-')(W  +  8//)(L-'"-L-''5L^L-^) 

=  L'^HL'^  -  L'^bLL'^HL'^ 
Therefore  to  first  order  in  e 

|(y(0-n;|  ^  D,,,D^,,,(2-(l  +  -y)-||L-i|p-n'''-(l-"y)-''^ 


+     L 


•1  112 


)€ 


Da.uDa. 


12£li+iXi+xL., 


as  desired.  Applying  Theorem  4  yields  the  final  result.  □ 
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We  may  apply  Theorem  4  to  analyze  the  convergence  criterion  for  the  QR  algorithm  for 
eigenvalues  of  symmetric  tridiagonal  matrices  [15],  In  the  course  of  running  the  QR  algo- 
rithm on  a  symmetric  tridiagonal  matrix  one  must  examine  the  matrix 


T  = 


^;    «;-i 


and  decide  whether  b:  can  be  set  to  zero  ("convergence")  without  making  unacceptahly  large 
perturbations  in  the  eigenvalues.  Theorem  4  tells  us  that  if  T  is  -y-s.d.d.,  then  setting  bj  to 
zero  makes  relative  errors  no  larger  than 

exp(- ^^^^TT-T-^)  -  1  (6-4) 


^;  ";-i 


1-7 


in  any  eigenvalue.  This  result  is  attractive  because  it  is  inexpensive  and  purely  local;  it  only 
depends  on  bj  and  its  neighbors  aj  and  Oj^i  on  the  diagonal  of  T.  It  does  differ  from  the 
standard  criterion  which  essentially  asks  if 

\bj\ 

|a;l+  k.,-1  I 

is  small;  this  criterion  is  weaker  than  (6.4)  and  does  not  guarantee  high  relative  accuracy. 

Unfortunately,  even  using  (6.4)  as  a  convergence  criterion  does  not  guarantee  that  QR 
will  compute  eigenvalues  with  high  relative  accuracy;  there  are  examples  which  are  even 
fairly  strongly  s.d.d.  of  QR  computing  eigenvalues  with  incorrect  signs,  i.e.  no  relative  accu- 
racy at  all.    We  discuss  this  further  in  Section  10. 


7.  Perturbation  Theorems  for  Eigenvectors 

In  this  section  we  discuss  the  sensitivity  of  the  eigenvectors  of  symmetric  s.d.d.  matrices 
under  the  same  small  perturbations  as  in  section  6.  As  discussed  in  the  introduction,  the  stan- 
dard perturbation  bound  (1.3)  is  proportional  to  the  reciprocal  of  8ap{\,)  =  min  |\,-X,|. 

Thus,  if  the  absolute  distance  from  X,  to  its  nearest  neighbor  is  small,  we  expect  the 
corresponding  eigenvector  to  be  sensitive  to  perturbations.  Our  first  result  will  be  an  analo- 
gous theorem  which  replaces  the  gap  with  the  relative  gap 


relgap  (\,)  =   min 


(7.1) 


which  may  be  large  even  when  the  usual  gap  is  small. 

Even  more  may  be  shown.  Proposition  6  will  show  that  the  eigenvectors  are  scaled 
analogously  to  the  matrix  entries:  if  x,  is  the  eigenvector  for  X,,  and  X^  differs  from  X,  by  a 
large  factor,  the  ;-th  component  of  x,  will  be  small.  Theorem  7  will  show  that  small  relative 
perturbations  in  the  matrix  only  cause  small  perturbations  in  the  eigenvector  entries  relative 
to  their  upper  bounds  of  Proposition  6;  thus  some  tiny  eigenvector  components  may  be  deter- 
mined to  high  relative  accuracy  as  well.  Finally,  we  discuss  eigenvector  bounds  for  definite 
pencils  and  singular  vector  bounds  for  bidiagonal  matrices  (partially  settling  a  conjecture 
from  [6]). 

Perturbation  theory  and  algorithms  for  conventionally  diagonally  dominant  nonsym- 
metric  matrices  were  developed  in  [2],  under  the  assumption  that  the  gap  between  eigen- 
values greatly  exceeded  the  norm  y  of  the  offdiagonal  part.  Thus  these  results  apply  to  gen- 
eral nonsymmetric  matrices,  but  make  much  stronger  diagonal  dominance  assumptions  than 
we  do  and  obtain  weaker  results. 
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Theorem  6:  Let  H  =  \A\  be  a  y-s.d.d.  symmetric  matrix  with  respect  to  the  2-norm.  Let  E  be 
symmetric  and  have  2-norm  one,  and  define  //(6)  =  A(A  ~tE)\.  Let  X,(e)  be  the  i-th  eigenvalue 
of  H  {(.),  and  assume  X,(0)  is  simple  so  that  the  corresponding  unit  eigenvector  x,{f-)  is  ^'ell 
defined  for  sufficiently  small  e.  Then 


(1  -  7)  relgap{K,) 


Proof:  From  [11]  we  have 


Let »  =  Axi,.  Then 


^     xJ.AEAx, 
x,{e)  =  x.(0)  +  e-Z   77 T^^*  +   O(e^) 


x,(€)  =  x,(0)  +  £X      J'^-''      X,  +  O(e^) 


The  pair  (X^.y^)  is  an  eigenpair  of  the  pencil  A  -  XA    '.  Thus  ylAyi^  =  X^y^A    'yj^.     From 
Lemma  3  we  have 

(1  -   7)||y,|p   <    ly[A>.,l   =    |X,|-||A-V*ir'   =    |X*I  ^   (1   +   ^)I|>'JP     . 

Thus 

(-^)'=^   ||,,||^(^):'2  (7.2) 

1    +    "Y  1    -    -V 


If  we  let  Ti  =  yt/||y*||  then 

k*i        (X,  -  Xt)/ |X,Xi 
where  (1  +  -y)"'  <   |4,i  |  s  (1  -  -y)"'.  If  we  take  norms  then 


x,(6)  =  x,(0)  +  eS  e,*7^ TTTTTT-^T  +  ^('')  (7-3) 


k(0  -  :c,(0)||  < ("      V^     .     .    -   0(c  =  ) 

(1  -  ■^)  relgap{k,) 


as  desired.  D 


Corollary  3:  Let  //(c),  X,(€),  and  x,(€)  fce  as  in  Theorem  6.  Assume  further  that  //(e)  is  -y- 
5. «/.(/.  for  all  Osese,  r/jar  l--y-3ne>0,  and 

3-2~"^-n-€ 
relgap{\,)  &  ^    . 

1  — -y  —  3ne 

lk(^-x,(0)||  s   ^^^^^^^^^ ^^^^-^    . 

2i\-y-3nT)-(relgapi\,)  -  ^ ^) 

1  — -y  —  3/i€ 

Proof:  The  idea  is  that  if  e  is  small  enough,  the  Xi(€)  can  only  change  by  a  small  relative 
amount,  so  the  relative  gap  can  only  change  by  a  small  absolute  amount.  From  Proposition  4, 
we  can  bound  the  perturbed  relative  gap  from  below  as  follows: 

|x,(e)-x,(0l  ix.(0)-x.(0)l--^(|x,(0)KM0)|) 

relgap(\X^))  =  min- —7-  ^  min 

*^-  |x,(€)x, (€)!>-       ^*.  |x,(o)x,(o)|(i--^^)-" 

l--y 
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nf       l>^,(0)|    ♦     |X,(0)1 
=   (l-T^^)min[(rWga/p(\,(0),X.(0))  - 
1--Y     *-, 


1-7       |\,(0)\,(0)|'- 


whererW«ap(X,(0),Xt(0))  =    |X,(0)- \i(0)  |- |X,(0)X»(0)  |  "'^ 

We  consider  two  cases,  relgap{\X0),\k(0))^2-^'' ,  and  re/^ap(X,(0),X*(0))£2-"^  The 
first  case  corresponds  to  X,(0)  and  Xt(0)  differing  by  at  least  a  factor  of  2,  whence 

|X,(0)|  +    |X*(0) 


min- 


|X,(0)X*(0) 


ll  - 


<  3Te/gap(X,(0),X;(0)) 


and 


re/^ap(X,(e),X,(€))  s  (1- 7^^)T«/sap  (X,(0),Xt(0))-(l  -  ^^)    . 


1-7'        °   ' 1-7 

The  second  case  corresponds  to  X,(0)  and  Xt(0)  differing  by  at  most  a  factor  of  2,  whence 

.     |X,(0)|  +    |Xt(0)l 


min- 


|X,(0)Xt(0) 


11 '2 


s  3-2 


-1'2 


and 


relgapi\X^),\,(e))  ^  (1- 7^^)-(rW«ap  (X,(0),X,(0))  -    ^'^^    ' '""  )    . 


Altogether,  we  have 


relgap{\,(^))^  (1--^^)(1- 


1-7 
3ne 


)irelgap(\XO)) 


1-7 


_    (3-2-'^^n€)/(l-7) 


(7.4) 


l-y  '   '-       1-7  '^      o   r  V    ,v    .,  1  -   3n6/(l-7)     ' 

Now  integrate  the  bound  of  Theorem  6  from  e  =  0  to  e  =  i"  to  get  the  desired  result.  D 

The  next  theorem  shows  that  the  components  of  each  eigenvector  are  scaled  analogously 
to  the  way  the  matrix  is  scaled: 

Proposition  6:  Let  H  =  AA^  be  a  -i-s.d.d.  symmetric  matrix  with  respect  to  the  2-norm.  Let 
H=XAX  be  its  eigenvector  decomposition,  where  X  =  [x ],  .  .  .  ,x„]  is  the  matrix  whose 
columns  are  orthonormal  eigenvectors  and  A  =  diag(Xi,  •  •  •  .^n)-  Let  x,{j)  be  the  j-th  com- 
ponent of  X,.  Then 


We  also  have 


k,0-)l  ^  ?-0)- 


,0) 


1  +  7 


u-^j 


3/2 


1  +  7 


1-7 


3/2 


•min( 


1/2 


) 


1  +  ' 


1-7 


1/2 


Proof:   Let  y,  =  Ax,    for   all    i.    First   we   consider   the   case   A„<A^^.    From    (7.2)    we    h 
Ibill  ^  (l^.|/(l-"y))''"-    Thus,  applying  Proposition  2  as  well, 

kO-)l  =   A-My,C;-)|  ^  A-'(|X,|/(l-7))"2  < 
as  desired.  We  may  also  write 


ave 


kO)|  =  A-'l>,0)l  ^  A,;-'(|X,|/(l-7))"2  ^  I~"(t^) 


1/2 


to  get  the  other  inequality. 
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No^A■  consider  the  case  A„aA  .  We  will  take  the  ;-th  components  of  both  sides  of  the 
equalit>'  Ay,  =  X,A"->-,,  and  bound  them  as  follows.  The  left  hand  side  component  is 
bounded  above  in  absolute  value  b\' 


(l-7)ll>',||  ^  (1--y)(1-7) 


■1  ; 


The  right  hand  side  component  is  bounded  below  in  absolute  value  by 

\\^^Jr\y,{j)\  ^  (i--y)|x,/x.,Ny,a)l  • 

Thus 


U,{j)\  =  Aj7ly.O) 


1         (l+7)l^;l 

"  (i->)'-|x,|' 


1  +  7 

1-7 


3  7 


X, 


as  desired.  The  bound  in  terms  of  Aj,/A„  is  obtained  similarly.  O 

Thus,  if  a  matrix  is  strongly  scaled  (the  Ajj  vary  greatly  in  magnitude),  the  eigenvectors 
will  be  strongly  scaled,  and  small  relative  perturbations  in  the  matrix  entries  will  not  be  able 
to  change  the  smaller  eigenvector  components  much.  In  the  next  theorem,  we  prove  some- 
thing even  stronger:  the  perturbations  in  the  eigenvector  components  x,(J)  will  be  small  com- 
pared to  the  upper  bounds  x,(J): 

Theorem  7:  Let  //(e)  be  as  in  Theorem  6.  Let  xX^)(J)  (denote  the  j-th  component  of  the  i-th  unit 
eigenvector  of  H(e).  Let  x,(J)  be  the  upper  bound  for  the  j-th  component  of  the  unperturbed 
unit  eigenvector  x,(0)(7).  Then 


k(€)0)  -  ^,(0)0)1  ^ 


-—•1,0)    +    0(6^) 


{\-y)mm{relgap{\,)  ,  2""-) 

(X,  is  the  same  as  X,(0)J.  Note  that  relgap  {\,)  exceeds  2~'  '  only  when  X,  differs  from  its  nearest 
neighbor  by  a  factor  greater  than  2  or  less  than  112. 

Proof:  We  start  from  (7.3)  in  the  proof  of  Theorem  6;  it  implies 


k(O(j-)-^,(0)0)l  =  k  S 


3:=    |x,x,|>- 


x,{0)U)  +    0(£^)| 


e(n-l)(l  +  7) 
(1-7)^- 

€(>l-l)(l  +  >)3^^ 
(1-7)^'^ 


|X,-Xi 
X, 


•min( 


1  : 


'.{n-\){\  +  'i) 


"ill 


(l->) 


5/2 


min( 
min( 
min( 


|x,-xj  ' 


1/2 


^, 


^. 


)    +    0(€=) 

_IM_ 
Ix.-XiK 


X, 

1/2 

> 

X, 

1/2 


1/2 


max(|X, 

1  .  |X* 

1) 

|X,- 

■x,| 

m\x\{relgap{K,)  ,  2""^) 


as  desired,  d 


A  version  of  Theorem  7  for  nonasymptotically  small  e  can  be  proven  in  the  same  way 
as  Corollary  3  was  derived  from  Theorem  6. 

If  we  have  a  cluster  of  eigenvalues  which  are  relatively  well  separated  from  the  others, 
similar  analyses  to  those  above  show  that  the  invariant  subspace  they  span  is  insensitive  to 
perturbations,  even  if  the  individual  eigenvectors  are  sensitive. 
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Now  consider  s.d.d.  definite  pencils.  From  Proposition  5  of  the  last  section,  we  know 
we  can  often  reduce  such  pencils  to  standard  form  without  sacrificing  diagonal  dominance: 
Given  H-kM,  M  positive  definite  with  lower  triangular  Cholesky  factor  L,  the  matrix 
Y  =  L~^HL  "^  has  the  same  spectrum  as  H  -\M .  Also,  if  v  is  an  eigenvector  of  Y,  x  =L~^y  is 
an  eigenvector  of  H  -  XM;  thus  Theorems  6  and  7  can  be  used  to  derive  perturbation  bounds 
on  X,  although  we  will  not  do  so  here. 

Finally,  we  consider  perturbation  theory  for  the  singular  vectors  of  bidiagonal  matrices. 
Let 

02     *2 


B  = 


(7.5) 


be  bidiagonal,  where  as  in  Theorem  1  we  may  assume  without  loss  of  generality  that  all  a, 
and  b,  are  positive.  Recall  that  the  left  singular  vectors  of  B  are  the  eigenvectors  of  BB^  and 
the  right  singular  vectors  of  B  are  the  eigenvectors  of  B^B.  Since 
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small  relative  perturbations  in  the  a,  and  b,  only  cause  small  relative  perturbations  in  the 
entries  of  BB^  and  B^B.  Therefore,  we  can  reduce  perturbation  theory  for  the  singular  vec- 
tors of  a  bidiagonal  matrix  to  perturbation  theory  for  the  tridiagonal  matrices  BB^  and  B^B. 
Proposition  7:  Let  B  be  as  in  (7.5).  Since  BB'''  and  B'^B  are  positive  definite  tridiagonal,  and 
hence  s.d.d.,  small  relative  changes  in  the  entries  of  B  cause  perturbations  in  the  singular  vec- 
tors as  described  by  Theorems  6  and  7.  More  specifically,  let  Di  =  diag((flB '')„)  and 
Dr  =  diag((fi'"B)„).  Then 


\DZ^'^BB'^DZ^'^  -  /||  <  ^i  =  2max  (max 


bjOj^] 


b„-, 


;<-.-.    (aj  +  bjyHal 


+  bj.,r' 
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||Dj"2fi^BD 


■1/2 


"J^J 


~  l\\  ^  "Y/t  =  2max  (max  — ; , 


nr) 


iai+b]y'' 

Both  7i  and  ^k  are  bounded  by  2.    Ifaj>3^'^bj.  then  yi  <  1,  and  if  aj>3^'^bj-i.  then  y^  <  1- 
Proof:  A  simple  computation  shows  that  the  diagonal  of  D[^'^BB'^D[^'^  consists  of  ones  and 
that      the      offdiagonals       are      Vj-iC''; +^--i)""^(a^i +^^)""^       for      j<n-l       and 
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fc„_i-(a2_,+i,^_,)-i;:  fory  =  n-l.    Thus  yi  bounds  the  1-norm  of  D^  "^fifl^Dz,  "^  -  /.  A 
similar  computation  applies  to  fi^B.  n 

Thus,  the  sensitivity  of  the  singular  vectors  to  relative  perturbations  in  the  entries  of  B 
is  governed  by  the  relative  gap  between  singular  values,  as  conjectured  in  [6].  Actually, 
more  was  conjectured:  it  does  not  appear  that  the  measure  of  diagonal  dominance  y  of  BB^ 
or  B^B  affects  the  singular  vector  sensitivity.  Proposition  7  supports  this  stronger  conjecture 
for  the  following  reason:  In  [6]  it  was  shown  that  bidiagonal  QR  iteration  with  a  zero-shift 
was  componentwise  forward  stable.  In  other  words,  small  relative  changes  in  the  bidiagonal 
entries  only  cause  small  relative  changes  in  the  Givens  rotations  which  implement  the  OR 
iteration  as  well  as  small  relative  changes  in  the  transformed  bidiagonal  matrix.  As  the  QR 
iteration  converges,  B  becomes  more  nearly  diagonal  and  eventually  the  scaled  diagonal  dom- 
inance measure  -y  approaches  zero.  If  the  number  of  QR  steps  to  make  -y  small  is  not  too 
large.  Proposition  7  and  forward  stability  together  imply  that  the  relative  gap  alone  governs 
the  singular  vector  sensitivity.  Even  though  the  required  number  of  QR  steps  seems  hard  to 
estimate  a  priori,  it  is  clearly  available  from  running  the  algorithm  so  that  error  estimates 
may  be  obtained  in  practice. 


8.  On  Condition  Numbers  and  the  Distance  to  the  Nearest  Ill-Posed  Problem 

In  [5]  it  was  observed  that  a  common  feature  of  many  numerical  analysis  problems  is 
that  their  condition  numbers  approximate  or  at  least  bound  the  reciprocal  of  the  distance  to 
the  nearest  ill-posed  problem,  i.e.  problem  whose  condition  number  is  infinite.  The  classical 
example  of  this  is  matrix  inversion,  where  the  condition  number  of  a  matrix  T  is 
k(7")  =  ||r||- 1|7~'  ||.  By  scaling  T  we  may  assume  without  loss  of  generality  that  ||r||=  1,  so 
that  k(7")  =  ||r~'  ||.  But  the  distance  (in  the  ||-{|  norm)  from  T  to  the  nearest  singular  matrix 
(nearest  matrix  whose  condition  number  is  infinite)  is  HT""'  ||  =  1/K(r).  Thus,  the  condition 
number  is  exactly  the  reciprocal  of  the  distance  to  the  nearest  ill-posed  problem.  This 
phenomenon  recurs  throughout  numerical  analysis,  although  we  usually  get  a  somewhat 
weaker  relationship,  such  as  one  sided  bounds  between  the  condition  number  and  reciprocal 
distance. 

Here  we  investigate  this  phenomenon  in  the  case  of  finding  the  eigenvectors  of  a  sym- 
metric matrix.  From  (1.3),  we  see  that  l/gap{k,)  is  a  condition  number  for  the  j-th  eigenvec- 
tor of  a  general  symmetric  matrix.  This  condition  number  is  infinite  precisely  when 
gap(\i)  =  0,  i.e.  X,  is  a  multiple  eigenvalue.  It  is  reasonable  to  call  such  an  eigenproblem  ill- 
posed  because  the  eigenvector  is  no  longer  uniquely  determined:  any  vector  in  an  at  least 
two-dimensional  invariant  subspace  will  do.  It  is  elementary  to  show  that  the  reciprocal  of 
this  condition  number  gives  exactly  the  distance  to  the  nearest  ill-posed  problem: 

Proposition  8:  Let  H  be  a  symmetric  matrix  with  simple  eigenvalue  X,;  thus 
gap{\,)  =  min  |X,-Xi|  >  0.     Then   the   smallest    \\hH  \\   such   that  the   eigenvalue  of  H +hH 

"corresponding  to"  X,  is  multiple  is 

mm  \\hH\\  = . 

By  "the  eigenvalue  corresponding  to  X,  is  multiple"  we  mean  that  if  the  continuous  function  X/(^) 
is  an  eigenvalue  of  H  +  ^hH  for  all  0<|<1,  with  X,(0)  =  X„  then  X,(^)  is  simple  for  0^i<\ 
and  X,(l)  is  multiple. 

Proof:  Suppose  |Xj-X,|  =  gap{\i)  and  that  x,  and  Xj  are  corresponding  unit  eigenvectors.  Let 
hH  =  .S-{\j-\i){xix]  -  Xjx])  to  show  min  ||BH||  s  gap{\,)l2.  To  get  the  other  inequality 
apply  (1.1)  to  see  that  any  smaller  \\hH\\  could  not  move  either  X,  or  X^  more  than  half  the 
distance  towards  one  another,  n 
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It  turns  out  that  a  similar  relationship  holds  for  7-s.d.d.  symmetric  matrices,  provided 
we  measure  distances  in  the  scaled  way  used  so  far  in  this  paper,  and  that  we  use 
\lrelgap{\,)  as  a  condition  number.  This  is  interesting  because  it  extends  work  in  [5]  to  a 
case  where  the  distance  metric  is  quite  skewed  from  the  usual  norm,  and  shows  that 
l/relgap{\,)  is  the  most  natural  condition  number  for  this  problem,  because  it  shares  the 
same  geometric  properties  as  other  condition  numbers. 

Proposition  9:  Let  H  =  AAA  be  an  n  by  n  -^-s.d.d.  symmetric  matrix  with  simple  eigenvalue  X,; 
thus    relgap{X,)  =  min   |X,- Xj  |  ■  |X,Xt  |~' -  >  0.     Assume  further    that    re/gap  (X,)<2~ ''    (this 

means  that  X,  and  its  nearest  neighbor  differ  by  a  factor  between   .5  and  2).  Then  the  smallest 
\\bA  II  such  that  the  eigenvalue  of  A  {A  +5A)A  "corresponding  to"  X,  is  multiple  satisfies 


_2l'^_-3-n_  .  ^  3  ^  ^^    2"^-3-n     ^  ^y^  _  2'^^-12-n(l- ^)  j, 


relgap(\,) relgap(\,)  relgap{\,) 

2''-6n 
relgap{k,) 


|5A 


£  relgapi\,)-r"-n 


For  relgap(\,)«l,  the  lower  bound  on  min  ||Sy4  ||  behaves  like  relgap  (\,)(1- y)/(2^'^  ■3-n). 

Proof:  By  scaling  H  we  may  assume  without  loss  of  generality  that  A„  =  1.    Let  X^  satisfy 
relgap(k,)  =    |X,-  Xj  \-  |X,X^  |'  -,  and  let  x,  and  Xj  be  corresponding  unit  eigenvectors. 

First  we  prove  the  upper  bound  on  min||5A||.  From  Proposition  6  we  have  that 
\ix,x!)i,,\^  {l  +  y)\l-y)-^A,A,.  Thus  jjA -'x,xrA- '  jj  s  n(l  +  >)3(l- ^)-5.  Let 

f)A  =  {\j-\,)x,xj.  Clearly  A(A  +bA)A  has  a  multiple  eigenvalue  at  \j  as  desired.  Also, 

|X,-X,|  =  rWgap(X,)-|X,X;r-  ^  2''~relgapik.)-\\,\  ^  2'- ■{!  + y)relgapi\,)    , 

which  when  combined  with  the  bound  on  ||A"'x,xfA"'  ||,  yields  the  desired  upper  bound. 

Now  we  consider  the  lower  bound.  Abbreviate  \\bA  \\  by  €.  Note  that  if  H  is  >-s.d.d., 
then  H  +  AhAA  is  (7  +  2€)(l-6)''-s.d.d.,  if  {y  +  2€)il-t)-^  <  1.  From  (7.4)  in  Corollary  3, 
we  see  that  if  (-y  +  2€)(l-€)"'<l,  then  the  perturbed  relative  gap  will  be  at  least 

3-2''-ne 


relgap{\,)  - 


1  -  (7  +  2€)(l-e)- 


In  order  for  the  perturbed  relative  gap  to  be  zero,  this  lower  bound  will  have  to  be  nonposi- 
tive,  and  so 

relgap(\,) 
€^     .'.         •(!  -  (7  +  20(l-€)->) 
3-2   --n 

Solving  for  the  smallest  c  satisfying  this  inequality  yields  the  desired  lower  bound.  When 
relgap{k,)«l  so  that  c  is  small  enough  that  (-y  +  2€)(l-€)~'=:-y,  we  get 
€^relgap{\,){l-y)/(3-2^'^-n)  as  desired.  D 

The  same  results  could  have  been  obtained  using  the  general  machinery  of  differential 
inequalities  in  [5],  but  these  proofs  are  more  straightforward. 


22 


9.  Algorithms  for  the  Bidiagonal  Singular  Value  Decomposition 

In  this  section  we  discuss  algorithms  capable  of  attaining  the  high  relative  accuracy 
inherent  in  the  data  as  described  in  Theorem  1.  Most  of  this  work  has  appeared  elsewhere, 
but  since  we  will  need  the  results  in  the  next  section,  we  outline  them  here. 

The  three  classes  of  algorithms  we  will  discuss  are  QR,  bisection,  and  divide  and  con- 
quer. The  standard  QR  iteration  [8]  as  implemented  in  LINPACK  [14]  does  not  compute  all 
singular  values  to  high  relative  precision.  It  may  be  modified,  however,  to  achieve  this  as 
described  in  [6].  Briefly,  the  idea  is  to  use  a  zero  shift  in  a  QR  sweep  when  a  tiny  singular 
value  is  present.  It  turns  out  this  zero-shift  QR  can  be  implemented  in  a  forward  stable  way 
that  only  introduces  small  relative  errors  in  each  entry  of  the  bidiagonal  matrix.  Corollary  1 
of  Theorem  1  then  guarantees  the  singular  values  are  not  changed  significantly.  The  standard 
convergence  criterion  must  also  be  changed  to  guarantee  high  relative  accuracy;  see  [6]  for 
details.  The  resulting  algorithm  is  not  only  more  accurate  than  the  standard  implementation 
but  faster  on  the  average;  this  is  because  the  zero-shift  QR  sweep  contains  significantly  fewer 
floating  point  operations  than  shifted  QR. 

It  was  conjectured  in  [6]  that  this  modified  QR  algorithm  computes  the  singular  vectors 
as  accurately  as  the  "relative  gap"  error  bounds  of  section  7  permit.  Proposition  7  strongly 
supports  but  does  not  prove  this  conjecture  in  all  cases. 

Bisection  is  another  method  that  guarantees  high  relative  accuracy.  An  error  analysis  of 
the  Sturm  sequence  recurrence  for  counting  the  number  of  singular  values  of  a  bidiagonal 
matrix  in  an  interval  [6]  shows  that  it  computes  the  exact  number  of  singular  values  for  a 
matrix  differing  from  the  original  one  only  by  small  relative  perturbations  in  each  entry; 
Corollary  1  of  Theorem  1  then  guarantees  high  relative  accuracy  again. 

Divide  and  conquer  [9]  has  not  yet  been  shown  to  achieve  high  relative  accuracy,  at 
least  without  resorting  to  extended  precision  arithmetic  in  the  inner  loop.  Achieving  this  accu- 
racy is  a  current  area  of  research. 

10.  Algorithms  for  the  Symmetric  Tridiagonal  Eigenproblem 

In  this  section  we  present  algorithms  for  computing  eigenvalues  of  -y-s.d.d.  symmetric 
tridiagonal  matrices  to  high  relative  accuracy.  We  note  that  reducing  a  dense  -y-s.d.d.  sym- 
metric matrix  to  tridiagonal  form  will  not  generally  preserve  diagonal  dominance  or  the  accu- 
racy of  the  eigenvalues;  thus  the  algorithms  in  this  section  are  suitable  only  when  the  original 
matrix  is  tridiagonal.  If  the  original  matrix  is  dense,  the  algorithm  in  the  next  section  should 
be  used. 

Briefly,  bisection  can  always  be  used  to  find  the  eigenvalues  accurately.  If  the  matrix  is 
positive  definite  as  well,  Cholesky  followed  by  the  algorithm  of  the  last  section  applied  to  the 
bidiagonal  Cholesky  factor  can  be  used.  QR  does  not  seem  to  work  in  general,  but  may  if  the 
matrix  is  strongly  diagonally  dominant  and  monotonically  graded.  It  is  still  an  open  question 
whether  divide  and  conquer  techniques  [4,  7]  can  achieve  high  relative  accuracy. 

As  with  the  bidiagonal  singular  value  problem,  the  standard  implementation  of  QR  for 
tridiagonal  matrices  does  not  guarantee  high  relative  accuracy  in  the  computed  eigenvalues 
for  symmetric  -y-s.d.d.  matrices,  even  if  the  change  in  convergence  criterion  suggested  in  sec- 
tion 5  is  adopted.  Even  if  a  zero-shifted  QR  algorithm  like  the  one  used  for  the  bidiagonal 
singular  value  decomposition  is  used,  relative  accuracy  is  lost  (in  numerical  experiments  on  a 
strongly  s.d.d.  positive  definite  matrix,  negative  eigenvalues  were  computed). 

However,  recent  work  by  Le  and  Parlett  [13]  gives  some  hope  that  zero-shifted  tridiago- 
nal QR  may  sometimes  be  used  in  the  same  way  as  the  bidiagonal  zero-shifted  QR  to  com- 
pute all  eigenvalues  to  high  relative  accuracy.  Their  work  shows  that  the  inner  loop  of  the 
standard  QR  iteration  may  be  modified  to  provide  componentwise  relative  stability  in  the 
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following  sense:  in  floating  point  this  modified  zero-shifted  QR  is  equivalent  to  making  small 
relative  perturbations  in  each  entry  of  the  tridiagonal  matrix,  performing  QR  exactly  on  this 
perturbed  matrix,  and  again  making  small  relative  perturbations  in  each  entry  of  the  resulting 
matrix.  This  is  a  much  stronger  kind  of  stability  than  the  usual  kind  described  in  paragraph 
(1.2)  of  the  introduction.  If  the  original  and  final  tridiagonals  are  also  >-s.d.d.,  Proposition  4 
would  imply  that  their  eigenvalues  agree  to  high  relative  accuracy.  Thus,  QR,  combined  with 
the  stopping  criterion  (6.4),  could  be  used  to  compute  all  eigenvalues  to  high  relative  accu- 
racy, but  only  if  all  the  matrices  produced  in  the  course  of  the  iteration  were  7-s.d.d.. 
Unfortunately,  numerical  experiments  show  this  is  not  the  case  in  general,  and  may  only  be 
true  if  the  original  matrix  is  very  strongly  diagonally  dominant  and  monotonically  scaled 
{H„^H,^i  ,^1).  Thus,  we  do  not  expect  to  be  able  to  generally  use  QR  based  algorithms  for 
the  -y-s.d.d.  tridiagonal  eigenproblem. 

If  we  limit  ourselves  to  positive  definite  matrices  T,  the  following  QR  based  approach 
will  work.  Recall  that  a  tridiagonal  matrix  T  is  positive  definite  if  and  only  if  it  has  a  positive 
diagonal  and  is  "y-s.d.d.  for  some  y<l. 

Algorithm  2:  Computing  the  eigenvalues  of  a  positive  definite  tridiagonal  matrix  T: 

1)  Compute  the  Cholesky  factorization  LL^=  T  of  T. 

2)  Find  the  singular  values  of  the  bidiagonal  matrix  L  using  the  bidiagonal  QR  algorithm 
of  section  9. 

3)  Square  the  singular  values  of  L  to  get  the  eigenvalues  of  T. 

To  show  that  this  method  is  viable,  one  needs  to  show  that  scaled  diagonal  dominance  is 
sufficient  to  guarantee  that  the  squares  of  the  exact  singular  values  of  L  are  all  relatively 
close  to  the  eigenvalues  of  A;  the  algorithms  of  section  9  then  guarantee  that  we  can  compute 
the  singular  values  of  L  to  high  relative  accuracy. 

To  this  end  we  present  a  backward  error  analysis  of  the  Cholesky  decomposition  of  a 
positive  definite  symmetric  tridiagonal  matrix  A.  Our  goal  is  to  show  that  the  computed 
Cholesky  factor  L  is  a  small  componentwise  relative  perturbation  of  the  exact  Cholesky  factor 
of  a  small  componentwise  relative  perturbation  of  A.  We  assume  the  usual  model  of  floating 
point  arithmetic //(a  op  b)  =  {a  op  b)(l+e),  where  |e  |^€  and  op  €  {+,-,x,/},  and  that 
the  floating  point  square  root  function  j^rr  satisfies  j(7rr(a)  =  (1+ e)a'-  where  |e  1^€. 

Proposition  10:  Let  A  be  an  n  by  n  positive  definite  symmetric  tridiagonal  matrix  with  diagonal 
entries  a\,  .  .  .  ,a„  and  offdiagonal  entries  b\,  .  .  .  ,bn-\.  Let  L  be  the  computed  Cholesky  fac- 
tor from  the  following  algorithm: 

III  =  s^rt(ai) 
for  1  =  1  to  n  —  I 

'i  +  l.i  =   b,fl,i 

'1  +  1. i^i  =  ■s«'-/(a, +  1-/7.1,,) 
endfor 

Then  barring  over/ underflow  and  attempts  to  take  square  roots  of  negative  arguments,^!-,  -  L  +  hL, 
|6L  I  £  5.5€|Z,|,  where  L  is  the  exact  Cholesky  factor  of  the  matrix  A  +hA  =  LL  ,  where 
\hA  I  S  3£  |i4  |,  and  the  diagonal  of  8A  is  zero. 

Proof:  We  construct  hA  and  L  as  follows.  Subscripted  es  denote  independent  quantities 
bounded  in  norm  by  e.  Then 

/„   =  fl{sqrt{a,))  =   (l  +  e,i)ai- 
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=   (l  +  €,4)-((l^£,3)(«--i-(l-«.:)'r-i,.))" 

=  (l+1.5-€,5)-(a,-,-(l  + «,:)(!+ e,i)'i'://7,)"' 
Let  /„  =  /„/(!- 1.5£;-,, 5),  with  £0.5  =  2£,i/3.  Then 

/,,,,,.,  =  (a,-,  -  (l  +  £,2)(l+e„)=fc:/(/:,(l+1.5£,_,.5)-))"2 

=  (a,.,  -  (l  +  3£,6)-*r/C,)" 

=  (a,^i   -  b,  //„)    '     . 

Let  /, .]  ,  =  b,/l„.  Then  L  is  the  exact  Cholesky  factor  of  A^hA  where  the  diagonal  of  6A  is 
zero  and  |8A,,,^i  |  =    ISe.gfc,  |  <  3£|fc,  ].    Therefore  |6A  |  £  3£  |A  |  as  desired. 

Note  that  |/„-/„|  ^  1.5£|/„|.    Since 
;  b.         (l  +  3£,fe)(l+1.5£,-i,5)fc.         (l  +  3£„)(l  +  1.5£,-i,5)  _  n  +  S  S.    V 

;,.!,,  =  —  = =  r— ''-1.'  -  (i+5.5£,7)Ui., 

A  A 

we  have  \L-L\  ^  5.5£  |L  |  as  desired,  n 

Theorem  8:  Lei  A  be  an  n  by  n  positive  definite  symmetric  tridiagonal  matrix,  and  L  its  com- 
puted Cholesky  factor  as  in  Proposition  10.  Let  -/<!  be  the  scaled  diagonal  dominance  of  A. 
Let  <j^-&   ■  ■  ■  -^Un  be  the  singular  values  of  L  and  X,  S  •  •  •  s  X„  fee  r/ie  eigenvalues  of  A.  Then 

(l--^)(l-5.5£r-2  ^  ^4Pr  ^  (l+-^)(l-5.5£)-^''-2    . 

When  £«1.  the  upper  and  lower  bounds  are  approximately  1±  [- 1-  (22n  -  11)]€. 

Proof:  Combine  Corollary  1  of  Theorem  1,  Theorem  2  and  Proposition  10.  □ 

It  is  easy  to  find  T  for  which  Algorithm  2  computes  all  the  eigenvalues  accurately,  but 
where  the  standard  QR  iteration  [15]  loses  all  accuracy  on  the  smallest  eigenvalues. 

Provided  that  the  bidiagonal  SVD  algorithm  of  section  9  can  compute  the  singular  vec- 
tors as  accurately  as  the  "relative  gap"  error  bound  permits  (see  the  discussion  of  section  9), 
Algorithm  2  will  compute  the  eigenvectors  of  T  as  accurately  as  Theorems  6  and  7  permit. 

Bisection  is  a  viable  algorithm  for  all  s.d.d.  tridiagonal  matrices.  A  similar  error 
analysis  to  the  one  for  bidiagonal  Sturm  sequences  shows  that  Sturm  sequences  can  find  accu- 
rate eigenvalues  of  7" -1-87"  where  hT  causes  only  small  relative  perturbations  in  each  entry  of 
T  [10].  No  pivoting  is  required,  so  Sturm  sequence  evaluation  can  be  done  in  linear  time  with 
no  storage  beyond  that  needed  for  T.  Together  with  Theorems  2  or  4,  this  implies  that  the 
computed  eigenvalues  all  have  high  relative  accuracy  if  the  matrix  is  symmetric  tridiagonal 
•y-s.d.d. 

As  in  the  bidiagonal  case,  the  ability  of  divide  and  conquer  algorithms  [7]  to  achieve 
high  relative  accuracy  is  an  open  problem. 
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11.  Algorithms  for  the  Dense  Symmetric  Eigenproblem 

First  we  present  a  new  algorithm  (or  rather  a  new  analysis  of  an  old  algorithm)  for 
finding  accurate  eigenvalues  of  (possibly  dense)  symmetric  s.d.d.  matrices.  The  algorithm  is 
based  on  bisection,  but  rather  than  computing  the  LDU  factorization  of  H  -x!  and  using  the 
number  of  negative  D„  to  compute  the  number  of  eigenvalues  of  H  less  than  x,  we  compute 
the  LDL^  factorization  of  A  -xA"^,  where  //  =  AAA. 

Second,  we  show  that  a  suitable  variation  of  inverse  iteration  can  compute  the  eigenvec- 
tors of  a  symmetric  s.d.d.  matrix  to  the  limiting  accuracy  of  the  "relative  gap"  error  bounds 
in  Theorems  6  and  7. 

Algorithm   3:  Stably  computing  the  inertia  of  a  shifted  y-scaled  diagonally  dominant 
symmetric  matrix  H  —  xI: 

(0)  We  assume  as  before  that  A„=  ±1.  We  consider  only  x>0;  for  x<0  consider 
-H  -xl. 

(1)  Permute  the  rows  and  columns  of  A  -xA~*  and  partition  it  as 

Aii-xAf'  A 12 

A21  A;2  -xAf 

so  that  if  a- xd~'  is  a  diagonal  entry  ofAu-xAf',  then  either  a  =  -1  or  a  =  \ 
and  xd~^3i2. 

(2)  Compute  a:  =  A22-'A2"- -A:i(A  ,, -xAr^)"'A  ,;. 

(3)  Compute  inertia(X)=  (n  ,  z  ,  p)  using  a  stable  pivoting  scheme  such  as  in  [3]. 
Here  n  is  the  number  of  negative  eigenvalues,  z  the  number  of  zero  eigen- 
values, and  p  the  number  of  positive  eigenvalues  of  X. 

(4)  The  inertia  of  W  -x/  is  (n-i-  dim(A  u)  ,  z  ,  p). 

Theorem  9:  Let  //  =  AAA  be  a  y-scaled  diagonally  dominant  symmetric  matrix  and  x>0  a  real 
scalar.  Algorithm  3  computes  the  exact  inertia  of  a  matrix  H  ^  hH  - xl ,  where  6//  =  A5AA, 
||5A  ||  =  0(€),  £  being  the  machine  precision.  Thus,  Algorithm  3  can  be  used  in  a  bisection  algo- 
rithm to  find  all  the  eigenvalues  of  H  to  high  relative  accuracy. 

Proof:  The  partitioning  guarantees  that  the  diagonal  entries  of  Au-xAf*  are  all  less  than  or 
equal  to  -1.  Therefore,  all  the  eigenvalues  of  An-xAi^  are  less  than  or  equal  to  -l-t-'y. 
Since  X  is  defined  so  that 


A 

u-'^\ 

2 

A 

12 

>V21 

A 

22- 

xAi 

I 


0 


A:,(A„-xAr2)-'    / 


•xa; 


0 


/    (A„-xAr^)-'A,2 
0  / 


the  inertia  of  A  -xA    ^  is  equal  to 


inertia(A-xA    ^)  =  inertia(A:)  -t-  inertia(A  n-xAj  ^)  =  inertia(X)  +  (dim(A  i,),0,0) 

by  Sylvester's  Theorem.  The  algorithm  in  [3]  will  compute  the  exact  inertia  of  X-tbX,  where 
||5A'||=0(e)||A:||.  Thus  if  we  show  that  ||X  ||  is  of  order  l-y  s  IIA22II  ^  1+7,  we  will  be 
done.  To  this  end  we  note  that  by  construction    ||xA2'^||^2,  an,i„(Aii— Jc^f^)^!-"/.   and 

l|A,2||=||A2,  11^7.  Thus 

ll^il  <  1^-,  +  2  + 


1-7         l-y 


as  desired.  □ 
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In  the  case  of  pencils,  there  is  an  analogous  algorithm.  If 
H  —  \M  =  A//A/yA//- XA',;,4;./A.\f  is  a  7-s.d.d.  definite  pencil,  we  compute  the  i.DZ.^  decompo- 
sition of  /i;/  — atA/^  '  A './/^ './A '.fA/y'  in  order  to  count  the  number  of  eigenvalues  less  than  x. 
Henceforth  we  will  assume  without  loss  of  generality  that  H-\M  =  H-kiiAA  with 
\fi„\  =  A„  =   1. 

Algorithm   4:  Stably  computing   the   inertia  of  a  shifted  y-scaled  diagonally  dominant  definite 
pencil  H  —  xM : 

(0)  As     stated     above,     we     assume     without    loss     of     generality     that    M=AAA     with 
|W„  \  =  A„  =  1.  We  also  consider  only  x>0;  for  x  <0  consider  - H  - xM . 

(1)  Permute  the  rows  and  columns  of  H  -xAAA  and  partition  it  as 

//n-*^r''4n^r'    Wiz-^Ar'/iizA:"'"! 
//21-xAf 'A^iAf'    //22-JrA2"'i4;:Af ' 

so  that  '\i  h  -  xd'-  is  a  diagonal  entry  of  Wu  -xAf '/\  jjAf ' ,  then  either  h  =  -\orh=\ 
andxJ"-  ^  ^  -  2(1-1- -y)/(l--^). 

(2)  Compute 

X  =  //2:-;rA:"'>\::A:"'  -  (W.i -*-^;"'>l;iA  r')-(Hii -xA  f'^A  ,,Ar')"'-(//i:-;cAr'A  i.A;"') 

(3)  Compute  inertia(X)  =   {n  ,  z  ,  p)  using  a  stable  pivoting  scheme  such  as  in  [3]. 

(4)  The  inertia  of  //  -  xM  is  (n  -r  dim(/l  u)  ,  z  ,  p  ). 


Theorem  10:  Let  H  —KM  =  A///i//A;^- XA  i/A  wA,\/  be  a  y-s.d.d.  definite  pencil  and  x>0  a  real 
scalar.  Algorithm  4  computes  the  exact  inertia  of  H  +  hH  —  xM,  where  bH  =  A^B^AA^, 
||S/4  II  =  0(e),  e  being  the  machine  precision.  Thus,  Algorithm  4  can  be  used  in  a  bisection 
algorithm  to  find  all  the  eigenvalues  of  H  —  \M  to  high  relative  accuracy. 

Proof:  Let  Y  =  //n  "^A  f'A  ,,  A  f  ' ,  and  define  A:  =  x"'AiW„  A, -A  „,  so  that  Y  =  xAf'ATAr'. 
Now  if  \{K)  is  any  eigenvalue  of  A',  we  have 

2 


HK)  s  X^,,,(x-'A,//nA,)  -   X„,„(An)  ^  r"'(1+^)  -   (I-7) 


and   \\K    'II  ^  2/(1-7).  Thus,  Y  is  also  nonsingular,  with  all  negative  eigenvalues.  Since  X 
and  Y  are  defined  so  that 

//ii-J^Af'AiiAf '    //,2-xAf'Ai2A2~' 

H21-J^A2"'A2iAr'     //22-*A2"'A22A2"^ 

/  0 

(//2i-^A2"'A2iAr')y"'    / 


Y   0 
0   X 


I    y-'(W,2-xAr'Ai2A2"') 
0  / 


we  have 


inertia(//-xA/)  =  inertia(X)  +  inertia(y)  =  inertia(X)  -I-  (dim(A  i,),0,0) 


by  Sylvester's  Theorem.  The  algorithm  in  [3]  will  compute  the  exact  inertia  of  X  +bX,  where 

||6X||  =  C>(€)||X||.  Thus  if  we  show  that  jjX  ||  is  of  order  I-7  <   ll/^::  II  ^  l  +  'C.  we  will  be 
done.    To  this  end  we  write 

ll^ll^   IIW22II+    ||xA2-'A22A2-MI  +    \\H2^y-'Hn\\+   llxAi-'Aj.Af'y-'//,.  jj 
+    HHziy^'^Ar'AnAj-'ll  +    ||xA2''A2iAr'y'';rAr'A,;A2"MI 
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=   ||//,J|  +    ||xAri/l,-,A.-M|  +    ||//.,;c"'A,/r~'A,W,:||  +    l|Ar'A:i^"'A,W,: 


+    ||W;,A,a:"'Ai2A2"'' II  +    ll*A:"'-4:!A'"Ui:A:"' II    • 
Using  the  facts  that   ||A  ,;  ||  <  -y,   IJA;,  ||  <  y,   ||//,:||  s  7,   ||//;,  j]  <  ■^,   \\K~''\\  <  2/(1--/), 
llAilHlAfMl  ^  l,x-'llAi|p  ^  ^-',  andxIlAf'll'  ^  t^,  we  get 

||x||  s  l  +  'y  +  (l  +  -y)^l  +  r(2/(l-7))pL-'  +  >=(2/(l-7))  +  -^-(2/(1-7))  +  7'(2/(l->))^t 

^  14/(1-7)' 

as  desired,  n 

Now  we  present  a  variation  on  inverse  iteration  which  can  compute  the  eigenvectors  of 
a  symmetric  s.d.d.  matrix  to  the  limiting  accuracy  of  the  "relative  gap"  error  bounds  of 
Theorems  6  and  7.  A  similar  algorithm  applies  to  pencils: 

Algorithm  5:  Inverse  iteration  for  computing  the  eigenvector  x  of  a  symmetric  s.d.d.  matrix 
//  =  AAA  corresponding  to  eigenvalue  z: 

(0)  We  assume  the  eigenvalue  z  has  been  computed  accurately  using  one  of  the  previous 
algorithms. 

(1)  Choose  a  unit  starting  vector  >q;  set  /  =  0. 

(2)  Compute  the  LDL^  factorization  of  P^(A  -zA~')P,  where  P  is  the  same  permutation  as 
in  Algorithm  3. 

(3)  Repeat 

J  =  1  +  1 

Solve  (A  —zA~')y,  =  y,-]  for  y,  using  the  LDL^  factorization  of  step  (2) 
r  =   1/||>',|| 
y,  =  ry, 
until  (r  =  0(€)) 

(4)  X  =  A-'y, 

Theorem  11:    Suppose  Algorithm  5  terminates  with  x  as  the  computed  eigenvector  of  the  sym- 
metric s.d.d.   matrix  H  =  AAA.    Then   there   is  a   diagonal  matrix  D   with   D„  =   l+0(e),    €  = 
machine  precision,  and  a  matrix  bA,    \\bA  ||  =  0(e),  such  that  Dx  is  an  exact  eigenvector  of 
A(A  +  6A)A.  Thus,  the  error  in  x  is  bounded  by  the  results  in  Theorems  6  and  7. 

Sketch  of  Proof:  Let  y,  be  the  computed  solution  of  (A  -zA"*)>,  =  >,_,  at  the  last  iteration  of 
Algorithm  5.  Applying  the  error  analysis  of  the  proof  of  Theorem  4,  one  can  show  that  there 
is  a  diagonal  matrix  D,  D„=l+0{€),  and  an  E,  \\E\\  =  0(e),  such  that 
D  {A  -zA~^  +  E)Dy,  =  y,-i.  Applying  the  result  in  [1],  we  can  assume  E  is  symmetric.  Since 
Algorithm  5  guarantees  r  =  l/lly,!!  =  0(e),  another  application  of  the  result  in  [1]  guaran- 
tees the  existence  of  a  symmetric  F,  \\F  \\  =  0{t),  such  that  (A  -zA~'^  + E  +  F)Dy,  =  0.  Thus, 
Dy,  is  an  exact  eigenvector  of  A  +£ +/"  -XA~^  for  \  =  z,  and  Dx  =  DA~'y,  is  an  exact  eigen- 
vector of  A  (A  +  8A)A,  ||8A||  =   ||£4-F||  =  0(e)  as  desired.  D 
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12.  Summary  and  Future  Work 

We  have  shown  that  there  are  a  number  of  situations  where  tiny  eigenvalues  and  singu- 
lar values  can  be  determined  much  more  accurately  than  standard  perturbation  theorems  and 
numerical  algorithms  can  guarantee.  This  is  true  for  singular  values  of  bidiagonal  matrices, 
eigenvalues  of  symmetric  s.d.d.  matrices  and  eigenvalues  of  s.d.d.  definite  pencils.  In  addi- 
tion, eigenvectors  of  symmetric  s.d.d.  matrices  corresponding  to  relatively  isolated  eigen- 
values are  determined  accurately  by  the  data.  Open  questions  remain  in  the  perturbation 
theory  for  the  singular  vectors  of  bidiagonal  matrices  and  in  perturbation  theory  for  eigen- 
values of  s.d.d.  pencils  with  positive  and  negative  eigenvalues. 

The  following  is  a  tabular  summary  of  the  current  state  of  research  into  corresponding 
high  accuracy  algorithms.  We  consider  two  classes  of  algorithms  for  eigenvalues  and  singular 
values  (bisection  and  QR),  and  two  classes  of  algorithms  for  eigenvectors  and  singular  vec- 
tors (inverse  iteration  using  accurate  eigenvalues/singular  values,  and  QR).  If  an  algorithm 
was  presented  in  earlier  research,  a  reference  is  given;  otherwise  it  was  discussed  here  for  the 
first  time.  Conjectured  algorithms  are  also  indicated.  (Divide  and  conquer  is  another  tech- 
nique for  these  problems.  Since  its  ability  to  deliver  high  accuracy  has  not  been  proven  in 
any  of  the  cases  considered  in  this  paper,  it  qualifies  as  a  "conjectured"  algorithm  for  all  of 
them.) 

Bisection  based  algorithms  for  computing  eigenvalues  and  singular  values  to  high  relative  accu- 
racy: 

1)  Singular  values  of  bidiagonal  matrices  [6]. 

2)  Eigenvalues  of  symmetric  tridiagonal  s.d.d.  matrices  (Theorem  4  and  [10]). 

3)  Eigenvalues  of  not  necessarily  tridiagonal  symmetric  s.d.d.  matrices  (Algorithm  3). 

4)  Eigenvalues  of  s.d.d.  definite  pencils  (Algorithm  4). 

QR  based  algorithms  for  computing  eigenvalues  and  singular  values  to  high  relative  accuracy: 

1)  Singular  values  of  bidiagonal  matrices  [6]. 

2)  Eigenvalues  of  symmetric  positive  definite  tridiagonal  matrices  (Algorithm  2). 

3)  Eigenvalues  of  symmetric  indefinite  tridiagonal  scaled  diagonally  dominant  matrices:  no 
QR  based  algorithm  appears  to  work  in  general. 

Inverse  Iteration  based  algorithms  for  computing  eigenvectors  and  singular  vectors  accurately: 

1)       Eigenvectors  of  symmetric  s.d.d.  matrices  and  pencils  (Algorithm  5). 

QR  based  algorithms  for  computing  eigenvectors  and  singular  vectors  accurately: 

1)  Conjectured:  singular  vectors  of  bidiagonal  matrices  (Proposition  7  and  [6]). 

2)  Conjectured:   Eigenvectors  of  symmetric  positive  definite   tridiagonal   s.d.d.   matrices 
(Algorithm  2). 

3)  Eigenvectors  of  symmetric  indefinite  tridiagonal  s.d.d.  matrices:  since  the  eigenvalues 
apparently  cannot  be  computed  accurately,  neither  can  the  eigenvectors. 

In  summary,  various  numerical  algorithms  arc  available  to  compute  eigenvalues  and 
eigenvectors,  and  singular  values  and  singular  vectors  with  high  accuracy.  Algorithm  2  for 
the  symmetric  positive  definite  tridiagonal  eigenproblem  will  be  incorporated  in  the 
LAPACK  linear  algebra  library  [12].  Not  all  algorithmic  questions  have  been  settled,  how- 
ever, cind  these  will  be  the  subject  of  future  research. 
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